Jamal module to read and write Excel files.
This package contains macros to
-
open an Excel file (either
xls
orxlsx
format), -
fetch values from the different cells,
-
change the values and setting of different cells,
-
delete sheets, rows, columns and cells, and
-
merge cells.
Jamal will write back the changes to the file or to a new file.
The macros are all named with the xls:
prefix.
This macro can open a file in xls
or xlsx
format.
The file is opened in read-write mode by default.
This can also be enforced by specifying the parop READ
.
The macro returns and empty string, but the opened file will be stored in a macro.
The name of this macro can be specified in a parop, and has a default value xls$:worksheet
.
Since this workbook name is used in all other xls
macros, you do not need to care about this unless you work with multiple worksheets opened the same time.
The following example opens the file README.xlsx
in the same directory as the README.adoc.jam
file.
EMPTY STRING="{@xls:open file=README.xlsx output=READYOU.xlsx}"
{@try! {xls$:worksheet}}
This file exists in the source code structure and part of the documentation. The examples in this document later will use this file.
Note
|
The sampless in this document keep the macros between the individual samples, but the macros are closed if they are closable.
Therefore, individual samples have to open the XLS file individually.
The subsequent samples always open the |
The output of this XLS opening is empty, but the workbook is stored in the macro xls$:worksheet
.
EMPTY STRING=""
'{xls$:worksheet ...' is defined but cannot be used as a macro.
Note that the worksheet is stored with a name as a macro, and among the macros, but it cannot be used as a macro in the text. It stores an XLS workbook object, and not a user defined macro.
Parameter options (parops) can be defined without parentheses.
The (
and )
are optional.)
-
file
(aliases arein
,input
,from
) is the name of the file that contains the XLS workbook. Technically,file
is also an alias; thus a macro having the name offile
will not be considered. This parop must be defined. -
out
(aliases areoutput
,to
) is the name of the file where the XLS workbook is written. This parop is optional. If this parop is defined, then the mode is automatically considered to beWRITE
. In other cases,READ
is the default, and it is an error to specifyREAD
when there is a defined output. -
workbook
(aliases arewb
) is the name of the workbook that is used to refer to the workbook in the rest of the document. This parop is optional and the default value isxls$:worksheet
. -
READ
orWRITE
is the mode of the operation. If theout
parop is defined then the mode isWRITE
. This parop is optional and the default value isREAD
.
Will close the workbook.
The name of the macro containing the workbook is the input of the macro.
This is the name specified in other macros as wb
or workbook
parop.
There is no need to call this macro, the workbook is automatically closed as soon as the Jamal script ends.
Will read information from a given cell and return it as the macro result.
You can get the value as well as the content (in the case of formula it is the formula), or other properties of the cell.
The input of the macro can be the cell location, but the cell can also be defined using the parops xls:sheet
, xls:row
and xls:column
.
-
workbook
(aliases arewb
) is the name of the workbook that is used to refer to the workbook in the rest of the document. This parop is optional and the default value isxls$:worksheet
. -
xls:sheet
(aliased assheet
) can identify the sheet that the cell read is in. When the sheet is not defined in any way, then the first sheet is used. The namexls:sheet
can also be used as a macro, thus if you define a macro with this name, its value will be used as the name of the sheet. This feature can be used to set the default sheet is nothing else is defined. This parop is usually used together withxls:row
andxls:col
to define the location of the cell. They can be used to set one as a macro and use the other as a parop iterating through the cells in the same row or column. The usual way is specifying the cell in the macro input the usualSheet!A1
style. -
xls:row
(aliased asrow
) is the row number of the cell. The indexing starts from 0. -
xls:col
(aliased ascol
) is the column number of the cell. The indexing starts from 0. -
there are several options to specify what the result of the macro should be.
-
value
specify you want the value of the cell. This is the default if you do not specity any of these options. The value is usually the same as the content. The difference is in the case of formulas. For formulas the value is the calculated value, the content is the formula itself. -
content
specify this if you want the content of the cell. See the details in the description of thevalue
option. -
type
specify this if you want the type of the cell. -
format
specify this if you want the format of the cell. -
comment
specify this if you want the comment of the cell. -
commentAuthor
specify this if you want the author of the comment of the cell. -
style
specify this if you want the style of the cell. -
hasComment
specify this if you want to know if the cell has a comment. The result istrue
orfalse
. -
isString
specify this if you want to know if the cell is a string. The result istrue
orfalse
. -
isNumeric
specify this if you want to know if the cell is a number. The result istrue
orfalse
. -
isBoolean
specify this if you want to know if the cell is a boolean. The result istrue
orfalse
. -
isFormula
specify this if you want to know if the cell is a formula. The result istrue
orfalse
. -
isBlank
specify this if you want to know if the cell is blank. The result istrue
orfalse
. -
isError
specify this if you want to know if the cell is an error. The result istrue
orfalse
. -
isNull
specify this if you want to know if the cell is null. The result istrue
orfalse
.
-
-
When you specify
style
then the result is a string that contains the style of the cell. You can also specify one of the following parops to get only one specific style property.-
toString
is the default value for style. The result will contain all the style elements -
align
the alignment of the cell. -
border
the border of the cell. -
fill
the fill of the cell. -
dataFormat
the data format of the cell. -
hidden
the hidden property of the cell. -
locked
the locked property of the cell. -
rotation
the rotation of the cell. -
shrinkToFit
the shrink to fit property of the cell. -
verticalAlignment
the vertical alignment of the cell. wrapText the wrap text property of the cell.
-
{@xls:open file=READYOU.xlsx}
{@xls:cell A1}
{@xls:cell (style)A1}
will result in
This is the content of the cell A1.
align=GENERAL, border=NONE, fill=SOLID_FOREGROUND, dataFormat=0, hidden=false, locked=true, rotation=0, shrinkToFit=false, verticalAlignment=BOTTOM, wrapText=false
This macro can be used to set the value of a cell. The input of the macro is the value to be set. The cell is identified by the parameter options. Parameter options are also used to define which part of the cell (value, style etc.) is set.
When setting boolean values the two values true
and false
are used.
If the value is missing or empty, true
value is used.
That way, for example, you can
{@xls:open file=READYOU.xlsx WRITE}
{@xls:set (cell=A1 wrapText)}
which will result, eventually an empty string, but it will modify the cell and
-
workbook
(aliases arewb
) is the name of the workbook that is used to refer to the workbook in the rest of the document. This parop is optional and the default value isxls$:worksheet
. -
xls:sheet
(aliased assheet
) can identify the sheet that the cell read is in. When the sheet is not defined in any way, then the first sheet is used. The namexls:sheet
can also be used as a macro, thus if you define a macro with this name, its value will be used as the name of the sheet. This feature can be used to set the default sheet is nothing else is defined. This parop is usually used together withxls:row
andxls:col
to define the location of the cell. They can be used to set one as a macro and use the other as a parop iterating through the cells in the same row or column. The usual way is specifying the cell in the macro input the usualSheet!A1
style. -
xls:row
(aliased asrow
) is the row number of the cell. The indexing starts from 0. -
xls:col
(aliased ascol
) is the column number of the cell. The indexing starts from 0. -
cell
is the cell reference where the cell is set. Ifcell
is specified, thenrow
andcol
should not be specified. -
One of the following parameters can define what is set in the cell:
-
value
the value of the cell. This is the default. -
formula
the value of the cell as a formula -
format
the format of the cell is set. -
comment
the comment of the cell is set. -
style
the style of the set. When this is set, one of the style parameter options can also be set. -
width
the width of the column -
height
the height of the row
-
-
The type of the cell. The default is
STRING
. The possible values areSTRING
,NUMERIC
,BOOLEAN
,FORMULA
,BLANK
,ERROR
. -
setting the style can use one of the following parameters:
-
align
-
border
-
bottomBorder
-
topBorder
-
leftBorder
-
rightBorder
-
borderColor
-
bottomBorderColor
-
topBorderColor
-
leftBorderColor
-
rightBorderColor
-
fillPattern
-
fillBackgroundColor
-
fillForegroundColor
-
dataFormat
-
hidden
-
locked
-
rotation
-
shrinkToFit
-
verticalAlignment
-
wrapText
-
font
-
zoom
specifies the zoom factor for the sheet. The value is a number that is the percentage of the zoom. It may optionally contain a%
sign at the end.
-
-
author
is the name of the author of the comment. Can only be used when the comment is set.
The macro xls:delete
can be used to delete a cell, a row, a column or a sheet.
The default is to delete a sheet given by the name, if no other option is given.
If the row is defined, then the row will be deleted.
If only the column is defined, then the column will be deleted.
If the cell is defined then the cell will be deleted.
In other cases, or if in doubt you can specify the type of the deletion by the parop
-
SHEET
to delete a sheet, -
ROW
to delete a row, -
COLUMN
orCOL
to delete a column, or -
CELL
to delete a cell.
The following example will delete the second column in the XLSX file.
{@xls:open file=READYOU.xlsx WRITE}
{@xls:delete (COLUMN col=1)}
-
workbook
(aliases arewb
) is the name of the workbook that is used to refer to the workbook in the rest of the document. This parop is optional and the default value isxls$:worksheet
. -
xls:sheet
(aliased assheet
) can identify the sheet that the cell read is in. When the sheet is not defined in any way, then the first sheet is used. The namexls:sheet
can also be used as a macro, thus if you define a macro with this name, its value will be used as the name of the sheet. This feature can be used to set the default sheet is nothing else is defined. This parop is usually used together withxls:row
andxls:col
to define the location of the cell. They can be used to set one as a macro and use the other as a parop iterating through the cells in the same row or column. The usual way is specifying the cell in the macro input the usualSheet!A1
style. -
xls:row
(aliased asrow
) is the row number of the cell. The indexing starts from 0. -
xls:col
(aliased ascol
) is the column number of the cell. The indexing starts from 0. -
cell
is the cell reference that is used to refer to the cell in the rest of the document. -
SHEET
,ROW
,COL
,COLUMN
,CELL
is the type of the object that is to be deleted.
Merge cells in the worksheet. To merge the cells, you have to define either
-
the top, bottom, left, and right cell, or
-
the top left cell and the bottom right cell.
-
workbook
(aliases arewb
) is the name of the workbook that is used to refer to the workbook in the rest of the document. This parop is optional and the default value isxls$:worksheet
. -
sheet
is the name of the sheet where the region is merged. If a sheet is defined, it has to be already in the workbook. If there is no sheet defined anywhere, either as this option or in theregion
then the first sheet is used. -
top
is the top row of the region to merge. Numbering is zero indexed. -
left
is the left column of the region to merge. Numbering is zero indexed. -
bottom
is the bottom row of the region to merge. Numbering is zero indexed. -
right
is the right column of the region to merge. Numbering is zero indexed. -
region
is the region to merge in the formatA1:B2
. The definition may contain the name of the sheet, but the sheet name in the first and second part of the region should be the same. It is enough to define the sheet in the first part, but not only in the second part. If the sheet name is defined assheet
and also here, the two definitions should be identical.
Unmerge merged cells in a worksheet.
The input of the macro is the cell reference of the merged cell.
The cell reference can be any individual cell that is part of the merged cell.
The cell can be defined as a cell reference on the input of the macro but also can be specified using the parops xls:sheet
, xls:row
and xls:column
.
-
workbook
(aliases arewb
) is the name of the workbook that is used to refer to the workbook in the rest of the document. This parop is optional and the default value isxls$:worksheet
. -
xls:sheet
(aliased assheet
) can identify the sheet that the cell read is in. When the sheet is not defined in any way, then the first sheet is used. The namexls:sheet
can also be used as a macro, thus if you define a macro with this name, its value will be used as the name of the sheet. This feature can be used to set the default sheet is nothing else is defined. This parop is usually used together withxls:row
andxls:col
to define the location of the cell. They can be used to set one as a macro and use the other as a parop iterating through the cells in the same row or column. The usual way is specifying the cell in the macro input the usualSheet!A1
style. -
xls:row
(aliased asrow
) is the row number of the cell. The indexing starts from 0. -
xls:col
(aliased ascol
) is the column number of the cell. The indexing starts from 0.
These macros convert row, column and sheet numbers and names to cell reference and back.
The xls:to:cell
macro is used to convert the row and column to the cell reference.
The row and column numbers are specified as parops row
and col
.
You can also specify rowAbsolute
and colAbsolute
to get the absolute cell reference.
The sheet name can be specified using the parop sheet
.
The macros xls:row
, xls:col
, and xls:sheet
are used to convert the cell reference to the row, column, and sheet name.
The input is the cell reference, and the value of the macro is the number of the row, the number of the column, or the name of the sheet.
If the cell reference does not contain a sheet name, the macro xls:sheet
will result in error.
The following example will convert the cell reference A1
to the row and column numbers.
{@xls:row A1}
{@xls:col A1}
results in
0
0
Both the row and the column are 0-based.
The next example uses absolute references:
{@xls:to:cell row=1 col=1}
{@xls:to:cell row=1 col=1 rowAbsolute colAbsolute}
will result in
B2
$B$2
Note
|
xls:sheet , xls:row and xls:col can also be defined as user defined macros.
Those will be used by the parop parsing.
They can be used to define a default sheet, row, or column, which is always the same for the duration of some operation.
The built-in macros are used to convert the values, and though they share the name with the user defined macros, they are in a different namespace.
|
This macro can be used to find a specific cell in a row or in a column.
Using this macro, you can find the first empty, or blank cell, or a cell with a specific value.
The most complex search is finding a cell with a value that evaluating the input of the macro will result in true
.
The return value of the macro is the cell reference of the found cell including the sheet name.
-
workbook
(aliases arewb
) is the name of the workbook that is used to refer to the workbook in the rest of the document. This parop is optional and the default value isxls$:worksheet
. -
xls:sheet
(aliased assheet
) can identify the sheet that the cell read is in. When the sheet is not defined in any way, then the first sheet is used. The namexls:sheet
can also be used as a macro, thus if you define a macro with this name, its value will be used as the name of the sheet. This feature can be used to set the default sheet is nothing else is defined. This parop is usually used together withxls:row
andxls:col
to define the location of the cell. They can be used to set one as a macro and use the other as a parop iterating through the cells in the same row or column. The usual way is specifying the cell in the macro input the usualSheet!A1
style. -
xls:row
(aliased asrow
) is the row number of the cell. The indexing starts from 0. -
xls:col
(aliased ascol
) is the column number of the cell. The indexing starts from 0. -
cell
is the reference to the cell where the search starts. This can be used instead of specifying therow
andcol
parameters -
You can specify what you are looking for. The default is
empty
meaning you want to get the first empty cell. The possible values are-
empty
to find the first empty cell -
blank
to find the first cell that is blank -
string
to find the first cell that contains the specified string -
number
to find the first cell that contains the specified number as a floating point number -
integer
to find the first cell that contains the specified number as an integer -
eval
to find the first cell that contains the specified macro evaluated totrue
. -
regex
to find the first cell that matches the specified regular expression. If the parameter option$
is specified, then the default iseval
and the search is for the macro.
-
-
You can specify where you are looking for the cell. The value can be
-
inRow
to search in the row -
inCol
to search in the column
-
-
reverse
will, as the name implies reverse the search direction. This means that the search starts from the cell specified and goes to the beginning of the row or column. -
$
ormacro
specify a name for a macro. If it is specified, then the macro will search for the cell that contains the input evaluated totrue
. The value of this parameter is the name of the macro that will hold the content of the cell during the evaluation. This macro will be defined before the first cell is examined and redefined before each next cell is examined. Note that the input of the macroxls:find
will be evaluated many times until the cell is found. Be careful not to have side effects in the input. See the examples below. -
limitRow
will limit the search in the rows. The search will go up to this row. For example, if the value is 10, then the search will go up to row 9. The default value is0x100000
, decimal 1,048,576 that is the limit in Excel 2007 and later, or zero when the search is reversed. -
limitCol
will limit the search in the columns. The search will go up to this column. For example, if the value is 10, then the search will go up to column 9 (A
toJ
). The default value is0x4000
, decimal 16,384 that is the limit in Excel 2007 and later, or zero when the search is reversed. -
orElse
is the value that will be returned if the cell is not found. If this parameter is not specified, then the macro will throw an exception if the cell is not found. The value can be any string, there is no check that the value is a valid cell reference.
The following example will find the first empty cell in the first row of the sheet.
{@xls:open file=src/test/resources/findSomething.xlsx}
{@xls:find (inRow col=0 row=2 empty)}
will result in
Sheet1!B3
as the cell A3
is not empty, but the next one, B3
is.
A more complex example is:
{@xls:open READ file=src/test/resources/findSomething.xlsx}
(1) {@xls:find (cell=G30 reverse inCol $=zz limitRow=20 orElse="arbad akarba")t{zz}}
(2) {@xls:find (cell=G1 inCol $=zz limitRow=20 orElse="abraka dabra")t{zz}}
(3) {@xls:find (cell=G1 inCol $=zz)t{zz}}
(4) {@xls:find (cell=A1 empty inRow)}
(5) {@xls:find (cell=A3 empty inRow)}
(6) {@xls:find (cell=G4 blank inCol)}
(7) {@xls:find (cell=G4 empty inCol)}
(8) {@xls:find (cell=G4 blank inCol reverse)}
(9) {@xls:find (cell=G1 inCol string)3}
(A) {@xls:find (cell=G1 inCol number)3.3}
(B) {@xls:find (cell=G1 inCol integer)3}
(C) {@xls:find (cell=G1 inCol number)3}
(D) {@xls:find (cell=G1 inCol regex)\d+}
The XLSX file is part of the Jamal module test resources, and the result depends on the content of this file. To understand the results, you should open the file in Excel and check the content of the cells.
The first example (1) will start at the cell G30
and search in the column for the first cell that is rue
.
The input of the macro is t{zz}
and zz
is defined to be used as a macro name for the content of each searched cell, when the cell content is rue
then t{zz}
will be true
.
The search is limited to row 20, thus the rows 21, 22, … 29 are checked.
None of them contains the text rue
therefore for none of them is the result true
.
In that case an error occurs unless the orElse
parop is defined.
In this case the returned value is arbad akarba
, which is abraka dabra
reversed.
The next search (2) is similar, but the search starts at row 1 and not reversed.
The search (3) is not limited, and it does find the cell G31
with the content rue
.
The rest of the examples demonstrate different searches.
The last example, (D) is a regular expression search.
The cells G1
, G2
, and G3
are empty.
The cell G4
contains 3.3
and the regular expression requires 0
as the first fractional digit.
Finally, the cell G5
contains 3.0
.
The result is:
(1) arbad akarba
(2) abraka dabra
(3) Sheet1!G31
(4) Sheet1!A1
(5) Sheet1!B3
(6) Sheet1!G26
(7) Sheet1!G27
(8) Sheet1!G3
(9) Sheet1!G5
(A) Sheet1!G4
(B) Sheet1!G5
(C) Sheet1!G5
(D) Sheet1!G5
A typical use of this macro is to insert information into the first empty cell in a row. That way, every execution of the document can add information to the next empty cell.
{@xls:open file=log_readme.xlsx WRITE}
{@xls:find (inCol cell=Sheet1!A1 empty)}
{#xls:set (cell={@xls:find (inCol cell=A1 empty)}){@date yyyy-MM-dd hh:mm:ss}}
This macro can be used to get a range of cells.
The input of the macro is the top left cell and the bottom right cell.
The workbook, the sheet and the corners can be defined using parameter options.
The return value of the macro is a list of cell references separated by ,
and |
characters.
This is the format that you can use in the core macro for
.
It is possible to get the list of the cell references and also to get a dual-level list. That way you can iterate over
-
each cell in the range, or
-
each row or each column in the range
using the core for
macro.
Note
|
When using the range macro, there has to be an opened workbook.
Although a simple range, like |
-
workbook
(aliases arewb
) is the name of the workbook that is used to refer to the workbook in the rest of the document. This parop is optional and the default value isxls$:worksheet
. -
sheet
is the name of the sheet where the region is merged. If a sheet is defined, it has to be already in the workbook. If there is no sheet defined anywhere, either as this option or in theregion
then the first sheet is used. -
top
is the top row of the region to merge. Numbering is zero indexed. -
left
is the left column of the region to merge. Numbering is zero indexed. -
bottom
is the bottom row of the region to merge. Numbering is zero indexed. -
right
is the right column of the region to merge. Numbering is zero indexed. -
region
is the region to merge in the formatA1:B2
. The definition may contain the name of the sheet, but the sheet name in the first and second part of the region should be the same. It is enough to define the sheet in the first part, but not only in the second part. If the sheet name is defined assheet
and also here, the two definitions should be identical. -
You can specify either
horizontal
orvertical
direction. The default ishorizontal
. The direction specifies how the range is traversed. If the direction ishorizontal
, then the range is traversed row by row, likeA1,A2,A3,B1,B2,B3,C1,C2,C3
. If the direction isvertical
, then the range is traversed column by column, likeA1,B1,C1,A2,B2,C2,A3,B3,C3
. -
If the
reverse
parameter istrue
, then the range is traversed in reverse order. -
The
$forsep
(aliased asseparator
orsep
) parameter specifies the string inserted between the cell references. The default is,
. -
The
$forsubsep
(aliased assubseparator
orsubsep
) parameter specifies the string inserted between the cell references in a multi-level range. The default is|
. Note that the default separator and subseparator are exactly the values that the core macrofor
uses by default. -
The
single
ormulti
parameter specifies how the range is provided. The default issingle
. In this case, each cell is an individual element. Ifmulti
is specified, then each row (if the direction ishorizontal
) or each column (if the direction isvertical
) is an element. These are separated by the subseparator. That way they can be used in afor
loop with multiple loop variables.
{@xls:open file=README.xlsx}
{@xls:range region=A1:B3}
will result in
README!A1,README!B1,README!A2,README!B2,README!A3,README!B3
The same range listing in vertical-first order:
{@xls:open file=README.xlsx}
{@xls:range region=A1:B3 vertical}
will result in
README!A1,README!A2,README!A3,README!B1,README!B2,README!B3
The netx example is the horizontal listing with multi-level range:
{@xls:open file=README.xlsx}
{@xls:range region=A1:B3 multi}
will result in
README!A1|README!B1,README!A2|README!B2,README!A3|README!B3
Note that the separators between the elements within a row are |
and between the rows are ,
fitting the multi variable loop structure.
The next example shows how we can read the cells in a loop using thr range macro.
Note that the for
loop parameter option evalist
tells the loop to evaluate the list of cells.
That way the iteration is over the cells list returned by the call to {@xls:range region=A1:C6 vertical}
.
The result of the loop is a list of cell reference macros, and the !
will Jamal evaluate this list and return the cell values.
{@xls:open file=README.xlsx}
{!@for [evalist] CellRef in ({@xls:range region=A1:C6 vertical})=CellRef={@xls:cell CellRef}
}
will result in
README!A1=This is the content of the cell A1.
README!A2=
README!A3=
README!A4=
README!A5=
README!A6=
README!B1=
README!B2=
README!B3=
README!B4=second will be deleted
README!B5=
README!B6=
README!C1=
README!C2=
README!C3=
README!C4=
README!C5=
README!C6=third will get to the second column
The next example is a multi-level range.
The region is defined as MULTI!A2:D11
and also the multi
parop is defined.
It will result in a list of lists.
Each list element is a list of cell references in a row separated ny ,
.
These lists are separated by |
characters.
The for
loop will put these into the loop variables Title
, Author
, Publisher
, and Year_of_publishing
.
{@xls:open file=README.xlsx}
{!@for [evalist] (Title,Author,Publisher,Year_of_publishing)
in
({@xls:range multi region=MULTI!A2:D11})=
{@xls:cell Author}: {@xls:cell Title}; {@xls:cell Publisher}, {@xls:cell Year_of_publishing}}
will result in the output
Jane Austen: Pride and Prejudice; T. Egerton, 1813
Herman Melville: Moby-Dick; Harper & Brothers, 1851
Mark Twain: Adventures of Huckleberry Finn; Charles L. Webster and Company, 1885
Oscar Wilde: The Picture of Dorian Gray; Lippincott's Monthly Magazine, 1890
Bram Stoker: Dracula; Archibald Constable and Company, 1897
H.G. Wells: The War of the Worlds; William Heinemann, 1898
Joseph Conrad: Heart of Darkness; Blackwood's Magazine, 1899
L. Frank Baum: The Wonderful Wizard of Oz; George M. Hill Company, 1900
Jack London: The Call of the Wild; Macmillan, 1903
E.M. Forster: A Room with a View; Edward Arnold, 1908
that reflects the content of the README.xlsx
file sheet named MULTI
.