Import one or more data.frame from 'Excel' 'xlsx' format
Usage
readOpenxlsx(
xlsx,
sheet = NULL,
startRow = 1,
startCol = 1,
rows = NULL,
cols = NULL,
check.names = FALSE,
check_header = FALSE,
check_header_n = 10,
verbose = FALSE,
...
)Arguments
- xlsx
characterpath to an 'Excel' file inxlsxformat, compatible withopenxlsx::read.xlsx().- sheet
one of
NULL,character, orintegervector, where:sheet=NULLwill import every sheet;characteris a vector of sheet names; andintegeris a vector of sheet index values. The sheet names are determined withopenxlsx::getSheetNames().- startRow
integerindicating the row number to start importing eachsheet.Note
startRowcan be a vector with lengthlength(sheet), to specify thestartRowfor eachsheet.Note
startRowis ignored whenrowsis defined for the same sheet, to minimize confusion about using both togetheer.
- startCol
integerindicating the first column number to retain after importing eachsheet.Note
startColcan be a vector with lengthlength(sheet), to specify thestartColfor eachsheet.Note
startColis ignored whencolsis defined for the same sheet, to minimize confusion about using both togetheer.
- rows
integervector indicating specific rows to import for eachsheet.To specify different
rowsfor eachsheet, supplyrowsas alistofintegervectors.Note that when
rowsis defined for a sheet, it will be used andstartRowwill be ignored for that same sheet.
- cols
integervector indicating specific column numbers to import for eachsheet.To specify different
colsfor eachsheet, supplycolsas alistofintegervectors.Note that when
colsis defined for a sheet, it will be used andstartColwill be ignored for that same sheet.
- check.names
logicalindicating whether to callmake.names()on thecolnamesof eachdata.frame.Note that
openxlsx::read.xlsx()does not honorcheck.names=FALSE, so a workaround is applied which loads a single line without column headers, in order to obtain the same data without mangling column headers. If this process fails, another workaround is to usestartRow=2(one higher than previous) andcolNames=FALSE.
- check_header
logicalindicating whether to test for presence of header rows, which may be multi-line column headers. Whencheck_header=TRUE, this method simply tests for the presence of rows that havencoldifferent than the remaining rows of data in the given sheet. When header rows are detected, the values are assigned to columndimnamesof thedata.frame.- check_header_n
integernumber of rows to test for header rows, only used whencheck_header=TRUE. This step is intended when the top row(s) contain fewer columns with headers, above actual column headers, for example the first rowc("Sample", "", "", "Lane", ""), and the second rowc("Name", "Type", "Label", "Name", "Type"). In this case the desired output is"Sample_Name","Sample_Type","Sample_Label","Lane_Name","Lane_Type"). This option default isFALSEdue to the number of exceptions seen in real data.- verbose
logicalindicating whether to print verbose output.- ...
additional arguments are passed to
openxlsx::read.xlsx().
Details
This function is equivalent to openxlsx::read.xlsx()
with a few minor additions:
It returns a
listofdata.frameobjects, one persheet.It properly reads the
colnameswithcheck.names=FALSE.
By default this function returns every sheet for a given
xlsx file.
Some useful details:
Empty columns are not skipped during loading, which means a worksheet whose data starts at column 3 will be returned with two empty columns, followed by data from that worksheet. Similarly, any empty columns in the middle of the data in that worksheet will be included in the output.
When both
startRowandrowsare applied,rowstakes priority and will be used instead ofstartRows. In factstartRowswill be definedstartRows <- min(rows)for each relevant worksheet. However, for each worksheet either argument can beNULL.
See also
Other jam export functions:
applyXlsxCategoricalFormat(),
applyXlsxConditionalFormat(),
set_xlsx_colwidths(),
set_xlsx_rowheights(),
writeOpenxlsx()
Examples
# set up a test data.frame
set.seed(123);
lfc <- -3:3 + stats::rnorm(7)/3;
colorSub <- nameVector(
rainbow2(7),
LETTERS[1:7])
df <- data.frame(name=LETTERS[1:7],
int=round(4^(1:7)),
num=(1:7)*4-2 + stats::rnorm(7),
fold=2^abs(lfc)*sign(lfc),
lfc=lfc,
pvalue=10^(-1:-7 + stats::rnorm(7)),
hit=sample(c(-1,0,0,1,1), replace=TRUE, size=7));
df;
#> name int num fold lfc pvalue hit
#> 1 A 4 0.7349388 -9.106049 -3.1868252 2.780730e-02 0
#> 2 B 16 5.3131471 -4.218488 -2.0767258 6.122279e-01 1
#> 3 C 64 9.5543380 -1.395160 -0.4804306 3.146665e-03 -1
#> 4 D 256 15.2240818 1.016424 0.0235028 1.079898e-06 -1
#> 5 E 1024 18.3598138 2.060645 1.0430959 5.027544e-05 0
#> 6 F 4096 22.4007715 5.945047 2.5716883 3.366732e-07 0
#> 7 G 16384 26.1106827 8.898972 3.1536387 8.554139e-09 1
# write to tempfile for examples
if (check_pkg_installed("openxlsx")) {
out_xlsx <- tempfile(pattern="writeOpenxlsx_", fileext=".xlsx")
writeOpenxlsx(x=df,
file=out_xlsx,
sheetName="jamba_test",
append=FALSE);
# now read it back
df_list <- readOpenxlsx(xlsx=out_xlsx);
df_list[[1]]
}
#> name int num fold lfc pvalue hit
#> 1 A 4 0.7349388 -9.106049 -3.1868252 2.780730e-02 0
#> 2 B 16 5.3131471 -4.218488 -2.0767258 6.122279e-01 1
#> 3 C 64 9.5543380 -1.395160 -0.4804306 3.146665e-03 -1
#> 4 D 256 15.2240818 1.016424 0.0235028 1.079898e-06 -1
#> 5 E 1024 18.3598138 2.060645 1.0430959 5.027544e-05 0
#> 6 F 4096 22.4007715 5.945047 2.5716883 3.366732e-07 0
#> 7 G 16384 26.1106827 8.898972 3.1536387 8.554139e-09 1