Import one or more data.frame from 'Excel' 'xlsx' format
Usage
readOpenxlsx(
xlsx,
sheet = NULL,
startRow = 1,
startCol = 1,
rows = NULL,
cols = NULL,
check.names = FALSE,
check_header = FALSE,
check_header_n = 10,
verbose = FALSE,
...
)
Arguments
- xlsx
character
path to an 'Excel' file inxlsx
format, compatible withopenxlsx::read.xlsx()
.- sheet
one of
NULL
,character
, orinteger
vector, where:sheet=NULL
will import every sheet;character
is a vector of sheet names; andinteger
is a vector of sheet index values. The sheet names are determined withopenxlsx::getSheetNames()
.- startRow
integer
indicating the row number to start importing eachsheet
.Note
startRow
can be a vector with lengthlength(sheet)
, to specify thestartRow
for eachsheet
.Note
startRow
is ignored whenrows
is defined for the same sheet, to minimize confusion about using both togetheer.
- startCol
integer
indicating the first column number to retain after importing eachsheet
.Note
startCol
can be a vector with lengthlength(sheet)
, to specify thestartCol
for eachsheet
.Note
startCol
is ignored whencols
is defined for the same sheet, to minimize confusion about using both togetheer.
- rows
integer
vector indicating specific rows to import for eachsheet
.To specify different
rows
for eachsheet
, supplyrows
as alist
ofinteger
vectors.Note that when
rows
is defined for a sheet, it will be used andstartRow
will be ignored for that same sheet.
- cols
integer
vector indicating specific column numbers to import for eachsheet
.To specify different
cols
for eachsheet
, supplycols
as alist
ofinteger
vectors.Note that when
cols
is defined for a sheet, it will be used andstartCol
will be ignored for that same sheet.
- check.names
logical
indicating whether to callmake.names()
on thecolnames
of eachdata.frame
.Note that
openxlsx::read.xlsx()
does not honorcheck.names=FALSE
, so a workaround is applied which loads a single line without column headers, in order to obtain the same data without mangling column headers. If this process fails, another workaround is to usestartRow=2
(one higher than previous) andcolNames=FALSE
.
- check_header
logical
indicating whether to test for presence of header rows, which may be multi-line column headers. Whencheck_header=TRUE
, this method simply tests for the presence of rows that havencol
different than the remaining rows of data in the given sheet. When header rows are detected, the values are assigned to columndimnames
of thedata.frame
.- check_header_n
integer
number of rows to test for header rows, only used whencheck_header=TRUE
. This step is intended when the top row(s) contain fewer columns with headers, above actual column headers, for example the first rowc("Sample", "", "", "Lane", "")
, and the second rowc("Name", "Type", "Label", "Name", "Type")
. In this case the desired output is"Sample_Name","Sample_Type","Sample_Label","Lane_Name","Lane_Type")
. This option default isFALSE
due to the number of exceptions seen in real data.- verbose
logical
indicating whether to print verbose output.- ...
additional arguments are passed to
openxlsx::read.xlsx()
.
Details
This function is equivalent to openxlsx::read.xlsx()
with a few minor additions:
It returns a
list
ofdata.frame
objects, one persheet
.It properly reads the
colnames
withcheck.names=FALSE
.
By default this function returns every sheet
for a given
xlsx
file.
Some useful details:
Empty columns are not skipped during loading, which means a worksheet whose data starts at column 3 will be returned with two empty columns, followed by data from that worksheet. Similarly, any empty columns in the middle of the data in that worksheet will be included in the output.
When both
startRow
androws
are applied,rows
takes priority and will be used instead ofstartRows
. In factstartRows
will be definedstartRows <- min(rows)
for each relevant worksheet. However, for each worksheet either argument can beNULL
.
See also
Other jam export functions:
applyXlsxCategoricalFormat()
,
applyXlsxConditionalFormat()
,
set_xlsx_colwidths()
,
set_xlsx_rowheights()
,
writeOpenxlsx()
Examples
# set up a test data.frame
set.seed(123);
lfc <- -3:3 + stats::rnorm(7)/3;
colorSub <- nameVector(
rainbow2(7),
LETTERS[1:7])
df <- data.frame(name=LETTERS[1:7],
int=round(4^(1:7)),
num=(1:7)*4-2 + stats::rnorm(7),
fold=2^abs(lfc)*sign(lfc),
lfc=lfc,
pvalue=10^(-1:-7 + stats::rnorm(7)),
hit=sample(c(-1,0,0,1,1), replace=TRUE, size=7));
df;
#> name int num fold lfc pvalue hit
#> 1 A 4 0.7349388 -9.106049 -3.1868252 2.780730e-02 0
#> 2 B 16 5.3131471 -4.218488 -2.0767258 6.122279e-01 1
#> 3 C 64 9.5543380 -1.395160 -0.4804306 3.146665e-03 -1
#> 4 D 256 15.2240818 1.016424 0.0235028 1.079898e-06 -1
#> 5 E 1024 18.3598138 2.060645 1.0430959 5.027544e-05 0
#> 6 F 4096 22.4007715 5.945047 2.5716883 3.366732e-07 0
#> 7 G 16384 26.1106827 8.898972 3.1536387 8.554139e-09 1
# write to tempfile for examples
if (check_pkg_installed("openxlsx")) {
out_xlsx <- tempfile(pattern="writeOpenxlsx_", fileext=".xlsx")
writeOpenxlsx(x=df,
file=out_xlsx,
sheetName="jamba_test",
append=FALSE);
# now read it back
df_list <- readOpenxlsx(xlsx=out_xlsx);
df_list[[1]]
}
#> name int num fold lfc pvalue hit
#> 1 A 4 0.7349388 -9.106049 -3.1868252 2.780730e-02 0
#> 2 B 16 5.3131471 -4.218488 -2.0767258 6.122279e-01 1
#> 3 C 64 9.5543380 -1.395160 -0.4804306 3.146665e-03 -1
#> 4 D 256 15.2240818 1.016424 0.0235028 1.079898e-06 -1
#> 5 E 1024 18.3598138 2.060645 1.0430959 5.027544e-05 0
#> 6 F 4096 22.4007715 5.945047 2.5716883 3.366732e-07 0
#> 7 G 16384 26.1106827 8.898972 3.1536387 8.554139e-09 1