Sisyphe.core.sisypheSheet
External packages/modules
NumPy, scientific computing, https://numpy.org/
pandas, data analysis and manipulation tool, https://pandas.pydata.org/
SciPy, fundamental algorithms for scientific computing, https://scipy.org/
- class Sisyphe.core.sisypheSheet.SisypheSheet(*args, **kwargs)
Description
Class derived from the pandas DataFrame class. Adds I/O methods (XML, Matlab file) to the pandas DataFrame class.
Inheritance
DataFrame -> SisypheSheet
Creation: 10/04/2023 Last revision: 13/12/2024
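Because SisypheSheet derives from pandas DataFrame, it is constructed like a DataFrame and inherits the full DataFrame API. The sketch below uses a hypothetical minimal subclass as a stand-in to illustrate the inheritance pattern (the real SisypheSheet adds the I/O methods documented below):

```python
import pandas as pd

# Hypothetical stand-in for SisypheSheet: a minimal DataFrame subclass.
class Sheet(pd.DataFrame):
    @property
    def _constructor(self):
        # Make DataFrame operations return the subclass rather than
        # a plain DataFrame, so added methods survive slicing etc.
        return Sheet

s = Sheet({"col 1": [1.0, 2.0], "col 2": [3.0, 4.0]})
sub = s[["col 1"]]           # inherited DataFrame indexing
print(type(sub).__name__)    # the subclass type is preserved
```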
- createXML(doc: Document, currentnode: Element) None
Write the current SisypheSheet instance attributes to an xml document instance.
Parameters
- doc : minidom.Document
xml document
- currentnode : minidom.Element
xml root node
- classmethod getFilterExt() str
Get the SisypheSheet file filter used by QFileDialog.getOpenFileName() and QFileDialog.getSaveFileName().
Returns
- str
‘PySisyphe Sheet (.xsheet)’
- classmethod load(filename: str) SisypheSheet
Create a SisypheSheet instance from a PySisyphe Sheet file (.xsheet).
Parameters
- filename : str
PySisyphe Sheet file name (.xsheet)
Returns
- SisypheSheet
loaded sheet
- classmethod loadCSV(filename: str) SisypheSheet
Create a SisypheSheet instance from a CSV file (.csv).
Parameters
- filename : str
CSV file name (.csv)
Returns
- SisypheSheet
loaded sheet
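loadCSV and saveCSV presumably delegate to the pandas CSV machinery; since SisypheSheet derives from DataFrame, the equivalent pandas round trip sketches the behavior (an in-memory buffer stands in for a .csv file):

```python
import io
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "value": [0.5, 1.5]})

# Write to an in-memory buffer standing in for a .csv file
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)

# Reading it back recovers the original data
loaded = pd.read_csv(buf)
print(loaded.equals(df))  # True
```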
- classmethod loadJSON(filename: str) SisypheSheet
Create a SisypheSheet instance from a JSON file (.json).
Parameters
- filename : str
JSON file name (.json)
Returns
- SisypheSheet
loaded sheet
- classmethod loadTXT(filename: str, sep: str = '|') SisypheSheet
Create a SisypheSheet instance from a text file (.txt).
Parameters
- filename : str
text file name (.txt)
- sep : str
optional (default ‘|’), character separator between values
Returns
- SisypheSheet
loaded sheet
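The default ‘|’ separator suggests loadTXT parses such files like pandas.read_csv with sep='|'; a short sketch under that assumption:

```python
import io
import pandas as pd

# '|'-separated text, matching loadTXT's default separator
text = "name|age\nRaphael|26\nDonatello|45\n"

# Assumption: loadTXT parses this like pandas.read_csv(..., sep='|')
df = pd.read_csv(io.StringIO(text), sep="|")
print(list(df.columns))      # ['name', 'age']
print(int(df["age"].sum()))  # 71
```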
- classmethod loadXLSX(filename: str) SisypheSheet
Create a SisypheSheet instance from an Excel file (.xlsx).
Parameters
- filename : str
Excel file name (.xlsx)
Returns
- SisypheSheet
loaded sheet
- classmethod parseXML(doc: Document) SisypheSheet
Create a SisypheSheet instance from xml document.
Parameters
- doc : minidom.Document
xml document
Returns
- SisypheSheet
new sheet
- classmethod parseXMLNode(currentnode: Element) SisypheSheet
Create a SisypheSheet instance from xml node.
Parameters
- currentnode : minidom.Element
xml node
Returns
- SisypheSheet
new sheet
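The exact .xsheet XML layout written by createXML is not documented here; the sketch below uses a hypothetical minimal scheme (one <row> element per record, values stored as attributes) to show how a sheet can round-trip through minidom, the module these methods take and return:

```python
import pandas as pd
from xml.dom import minidom

def sheet_to_xml(df: pd.DataFrame) -> minidom.Document:
    # Hypothetical scheme; the real .xsheet layout may differ.
    doc = minidom.Document()
    root = doc.createElement("sheet")
    doc.appendChild(root)
    for _, row in df.iterrows():
        node = doc.createElement("row")
        for col, val in row.items():
            node.setAttribute(col, str(val))
        root.appendChild(node)
    return doc

def xml_to_sheet(doc: minidom.Document) -> pd.DataFrame:
    rows = [dict(n.attributes.items()) for n in doc.getElementsByTagName("row")]
    return pd.DataFrame(rows)

df = pd.DataFrame({"subject": ["s1", "s2"], "score": ["10", "12"]})
restored = xml_to_sheet(sheet_to_xml(df))
print(restored.equals(df))  # True
```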
- save(filename: str) None
Save the current SisypheSheet instance to a PySisyphe Sheet file (.xsheet).
Parameters
- filename : str
PySisyphe Sheet file name (.xsheet)
- saveCSV(filename: str) None
Save the current SisypheSheet instance to a CSV file (.csv).
Parameters
- filename : str
CSV file name (.csv)
- saveJSON(path_or_buf: FilePath | WriteBuffer[bytes] | WriteBuffer[str] | None = None, *, orient: Literal['split', 'records', 'index', 'table', 'columns', 'values'] | None = None, date_format: str | None = None, double_precision: int = 10, force_ascii: bool_t = True, date_unit: TimeUnit = 'ms', default_handler: Callable[[Any], JSONSerializable] | None = None, lines: bool_t = False, compression: CompressionOptions = 'infer', index: bool_t | None = None, indent: int | None = None, storage_options: StorageOptions | None = None, mode: Literal['a', 'w'] = 'w') str | None
Convert the object to a JSON string.
Note NaN’s and None will be converted to null and datetime objects will be converted to UNIX timestamps.
Parameters
- path_or_buf : str, path object, file-like object, or None, default None
String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string.
- orient : str
Indication of expected JSON string format.
Series:
default is ‘index’
allowed values are: {‘split’, ‘records’, ‘index’, ‘table’}.
DataFrame:
default is ‘columns’
allowed values are: {‘split’, ‘records’, ‘index’, ‘columns’, ‘values’, ‘table’}.
The format of the JSON string:
‘split’ : dict like {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]}
‘records’ : list like [{column -> value}, … , {column -> value}]
‘index’ : dict like {index -> {column -> value}}
‘columns’ : dict like {column -> {index -> value}}
‘values’ : just the values array
‘table’ : dict like {‘schema’: {schema}, ‘data’: {data}}, describing the data, where the data component is like orient='records'.
- date_format : {None, ‘epoch’, ‘iso’}
Type of date conversion. ‘epoch’ = epoch milliseconds, ‘iso’ = ISO8601. The default depends on the orient. For orient='table', the default is ‘iso’. For all other orients, the default is ‘epoch’.
- double_precision : int, default 10
The number of decimal places to use when encoding floating point values. The possible maximal value is 15. Passing double_precision greater than 15 will raise a ValueError.
- force_ascii : bool, default True
Force encoded string to be ASCII.
- date_unit : str, default ‘ms’ (milliseconds)
The time unit to encode to, governs timestamp and ISO8601 precision. One of ‘s’, ‘ms’, ‘us’, ‘ns’ for second, millisecond, microsecond, and nanosecond respectively.
- default_handler : callable, default None
Handler to call if object cannot otherwise be converted to a suitable format for JSON. Should receive a single argument which is the object to convert and return a serialisable object.
- lines : bool, default False
If ‘orient’ is ‘records’, write out line-delimited json format. Throws ValueError for any other ‘orient’, since the others are not list-like.
- compression : str or dict, default ‘infer’
For on-the-fly compression of the output data. If ‘infer’ and ‘path_or_buf’ is path-like, then detect compression from the following extensions: ‘.gz’, ‘.bz2’, ‘.zip’, ‘.xz’, ‘.zst’, ‘.tar’, ‘.tar.gz’, ‘.tar.xz’ or ‘.tar.bz2’ (otherwise no compression). Set to None for no compression. Can also be a dict with key ‘method’ set to one of {‘zip’, ‘gzip’, ‘bz2’, ‘zstd’, ‘xz’, ‘tar’}; the other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdCompressor, lzma.LZMAFile or tarfile.TarFile, respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.
Added in version 1.5.0: Added support for .tar files.
Changed in version 1.4.0: Zstandard support.
- index : bool or None, default None
The index is only used when ‘orient’ is ‘split’, ‘index’, ‘column’, or ‘table’. Of these, ‘index’ and ‘column’ do not support index=False.
- indent : int, optional
Length of whitespace used to indent each record.
- storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://” and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details.
- mode : str, default ‘w’ (writing)
Specify the IO mode for output when supplying a path_or_buf. Accepted args are ‘w’ (writing) and ‘a’ (append) only. mode=’a’ is only supported when lines is True and orient is ‘records’.
Returns
- None or str
If path_or_buf is None, returns the resulting json format as a string. Otherwise returns None.
See Also
read_json : Convert a JSON string to pandas object.
Notes
The behavior of indent=0 varies from the stdlib, which does not indent the output but does insert newlines. Currently, indent=0 and the default indent=None are equivalent in pandas, though this may change in a future release.
orient='table' contains a ‘pandas_version’ field under ‘schema’. This stores the version of pandas used in the latest revision of the schema.
Examples
>>> from json import loads, dumps
>>> df = pd.DataFrame(
...     [["a", "b"], ["c", "d"]],
...     index=["row 1", "row 2"],
...     columns=["col 1", "col 2"],
... )
>>> result = df.to_json(orient="split")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
{
    "columns": ["col 1", "col 2"],
    "index": ["row 1", "row 2"],
    "data": [["a", "b"], ["c", "d"]]
}
Encoding/decoding a DataFrame using 'records' formatted JSON. Note that index labels are not preserved with this encoding.
>>> result = df.to_json(orient="records")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
[
    {"col 1": "a", "col 2": "b"},
    {"col 1": "c", "col 2": "d"}
]
Encoding/decoding a DataFrame using 'index' formatted JSON:
>>> result = df.to_json(orient="index")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
{
    "row 1": {"col 1": "a", "col 2": "b"},
    "row 2": {"col 1": "c", "col 2": "d"}
}
Encoding/decoding a DataFrame using 'columns' formatted JSON:
>>> result = df.to_json(orient="columns")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
{
    "col 1": {"row 1": "a", "row 2": "c"},
    "col 2": {"row 1": "b", "row 2": "d"}
}
Encoding/decoding a DataFrame using 'values' formatted JSON:
>>> result = df.to_json(orient="values")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
[
    ["a", "b"],
    ["c", "d"]
]
Encoding with Table Schema:
>>> result = df.to_json(orient="table")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)
{
    "schema": {
        "fields": [
            {"name": "index", "type": "string"},
            {"name": "col 1", "type": "string"},
            {"name": "col 2", "type": "string"}
        ],
        "primaryKey": ["index"],
        "pandas_version": "1.4.0"
    },
    "data": [
        {"index": "row 1", "col 1": "a", "col 2": "b"},
        {"index": "row 2", "col 1": "c", "col 2": "d"}
    ]
}
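One mode not shown above is line-delimited JSON: with lines=True (valid only for orient='records'), each record becomes one JSON object per line, a common format for streaming tools. A short sketch:

```python
import json
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# lines=True requires orient='records': one compact JSON object per line
nd = df.to_json(orient="records", lines=True)

records = [json.loads(line) for line in nd.strip().splitlines()]
print(records)  # [{'a': 1, 'b': 3}, {'a': 2, 'b': 4}]
```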
- saveLATEX(buf: FilePath | WriteBuffer[str] | None = None, *, columns: Sequence[Hashable] | None = None, header: bool_t | SequenceNotStr[str] = True, index: bool_t = True, na_rep: str = 'NaN', formatters: FormattersType | None = None, float_format: FloatFormatType | None = None, sparsify: bool_t | None = None, index_names: bool_t = True, bold_rows: bool_t = False, column_format: str | None = None, longtable: bool_t | None = None, escape: bool_t | None = None, encoding: str | None = None, decimal: str = '.', multicolumn: bool_t | None = None, multicolumn_format: str | None = None, multirow: bool_t | None = None, caption: str | tuple[str, str] | None = None, label: str | None = None, position: str | None = None) str | None
Render object to a LaTeX tabular, longtable, or nested table.
Requires \usepackage{booktabs}. The output can be copy/pasted into a main LaTeX document or read from an external file with \input{table.tex}.
Changed in version 2.0.0: Refactored to use the Styler implementation via jinja2 templating.
Parameters
- buf : str, Path or StringIO-like, optional, default None
Buffer to write to. If None, the output is returned as a string.
- columns : list of label, optional
The subset of columns to write. Writes all columns by default.
- header : bool or list of str, default True
Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.
- index : bool, default True
Write row names (index).
- na_rep : str, default ‘NaN’
Missing data representation.
- formatters : list of functions or dict of {str: function}, optional
Formatter functions to apply to columns’ elements by position or name. The result of each function must be a unicode string. List must be of length equal to the number of columns.
- float_format : one-parameter function or str, optional, default None
Formatter for floating point numbers. For example float_format="%.2f" and float_format="{:0.2f}".format will both result in 0.1234 being formatted as 0.12.
- sparsify : bool, optional
Set to False for a DataFrame with a hierarchical index to print every multiindex key at each row. By default, the value will be read from the config module.
- index_names : bool, default True
Prints the names of the indexes.
- bold_rows : bool, default False
Make the row labels bold in the output.
- column_format : str, optional
The columns format as specified in LaTeX table format e.g. ‘rcl’ for 3 columns. By default, ‘l’ will be used for all columns except columns of numbers, which default to ‘r’.
- longtable : bool, optional
Use a longtable environment instead of tabular. Requires adding a \usepackage{longtable} to your LaTeX preamble. By default, the value will be read from the pandas config module, and set to True if the option styler.latex.environment is “longtable”.
Changed in version 2.0.0: The pandas option affecting this argument has changed.
- escape : bool, optional
By default, the value will be read from the pandas config module and set to True if the option styler.format.escape is “latex”. When set to False, escaping of latex special characters in column names is disabled.
Changed in version 2.0.0: The pandas option affecting this argument has changed, as has the default value to False.
- encoding : str, optional
A string representing the encoding to use in the output file, defaults to ‘utf-8’.
- decimal : str, default ‘.’
Character recognized as decimal separator, e.g. ‘,’ in Europe.
- multicolumn : bool, default True
Use multicolumn to enhance MultiIndex columns. The default will be read from the config module, and is set as the option styler.sparse.columns.
Changed in version 2.0.0: The pandas option affecting this argument has changed.
- multicolumn_format : str, default ‘r’
The alignment for multicolumns, similar to column_format. The default will be read from the config module, and is set as the option styler.latex.multicol_align.
Changed in version 2.0.0: The pandas option affecting this argument has changed, as has the default value to “r”.
- multirow : bool, default True
Use multirow to enhance MultiIndex rows. Requires adding a \usepackage{multirow} to your LaTeX preamble. Will print centered labels (instead of top-aligned) across the contained rows, separating groups via clines. The default will be read from the pandas config module, and is set as the option styler.sparse.index.
Changed in version 2.0.0: The pandas option affecting this argument has changed, as has the default value to True.
- caption : str or tuple, optional
Tuple (full_caption, short_caption), which results in \caption[short_caption]{full_caption}; if a single string is passed, no short caption will be set.
- label : str, optional
The LaTeX label to be placed inside \label{} in the output. This is used with \ref{} in the main .tex file.
- position : str, optional
The LaTeX positional argument for tables, to be placed after \begin{} in the output.
Returns
- str or None
If buf is None, returns the result as a string. Otherwise returns None.
See Also
- io.formats.style.Styler.to_latex : Render a DataFrame to LaTeX with conditional formatting.
- DataFrame.to_string : Render a DataFrame to a console-friendly tabular output.
- DataFrame.to_html : Render a DataFrame as an HTML table.
Notes
As of v2.0.0 this method has changed to use the Styler implementation as part of Styler.to_latex() via jinja2 templating. This means that jinja2 is a requirement, and needs to be installed, for this method to function. It is advised that users switch to using Styler, since that implementation is more frequently updated and contains much more flexibility with the output.
Examples
Convert a general DataFrame to LaTeX with formatting:
>>> df = pd.DataFrame(dict(name=['Raphael', 'Donatello'],
...                        age=[26, 45],
...                        height=[181.23, 177.65]))
>>> print(df.to_latex(index=False,
...                   formatters={"name": str.upper},
...                   float_format="{:.1f}".format,
... ))
\begin{tabular}{lrr}
\toprule
name & age & height \\
\midrule
RAPHAEL & 26 & 181.2 \\
DONATELLO & 45 & 177.7 \\
\bottomrule
\end{tabular}
- saveMATFILE(filename: str) None
Save the current SisypheSheet instance to a Matlab file (.mat).
Parameters
- filename : str
Matlab file name (.mat)
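Given SciPy is listed among the external packages, saveMATFILE plausibly relies on scipy.io.savemat; a hedged sketch of one likely mapping (each column stored as a named Matlab variable, which is an assumption — the actual .mat layout may differ):

```python
import os
import tempfile

import pandas as pd
from scipy.io import loadmat, savemat

df = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})

# Assumption: one Matlab variable per column
path = os.path.join(tempfile.mkdtemp(), "sheet.mat")
savemat(path, {col: df[col].to_numpy() for col in df.columns})

# loadmat returns 2-D arrays (row vectors here), hence ravel()
data = loadmat(path)
restored = pd.DataFrame({col: data[col].ravel() for col in df.columns})
print(restored.equals(df))  # True
```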
- saveNPY(filename: str) None
Save the current SisypheSheet instance to a Numpy binary file (.npy).
Parameters
- filename : str
Numpy binary file name (.npy)
- saveNPZ(filename: str) None
Save the current SisypheSheet instance to a Numpy binary compressed file (.npz).
Parameters
- filename : str
Numpy binary compressed file name (.npz)
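The .npy/.npz formats store bare arrays, so saveNPY/saveNPZ presumably write the sheet's underlying numpy array (DataFrame.to_numpy()); note that column and index labels are not part of these formats and would have to be saved separately. A sketch with an in-memory buffer:

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})

# Assumption: saveNPY stores df.to_numpy(); labels are lost
buf = io.BytesIO()
np.save(buf, df.to_numpy())
buf.seek(0)

arr = np.load(buf)
print(arr.shape)  # (2, 2)
```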
- saveTXT(filename: str, sep: str = '|') None
Save the current SisypheSheet instance to a text file (.txt).
Parameters
- filename : str
text file name (.txt)
- sep : str
optional (default ‘|’), character separator between values
- saveXLSX(excel_writer: FilePath | WriteExcelBuffer | ExcelWriter, *, sheet_name: str = 'Sheet1', na_rep: str = '', float_format: str | None = None, columns: Sequence[Hashable] | None = None, header: Sequence[Hashable] | bool_t = True, index: bool_t = True, index_label: IndexLabel | None = None, startrow: int = 0, startcol: int = 0, engine: Literal['openpyxl', 'xlsxwriter'] | None = None, merge_cells: bool_t = True, inf_rep: str = 'inf', freeze_panes: tuple[int, int] | None = None, storage_options: StorageOptions | None = None, engine_kwargs: dict[str, Any] | None = None) None
Write object to an Excel sheet.
To write a single object to an Excel .xlsx file it is only necessary to specify a target file name. To write to multiple sheets it is necessary to create an ExcelWriter object with a target file name, and specify a sheet in the file to write to.
Multiple sheets may be written to by specifying unique sheet_name. With all data written to the file it is necessary to save the changes. Note that creating an ExcelWriter object with a file name that already exists will result in the contents of the existing file being erased.
Parameters
- excel_writer : path-like, file-like, or ExcelWriter object
File path or existing ExcelWriter.
- sheet_name : str, default ‘Sheet1’
Name of sheet which will contain DataFrame.
- na_rep : str, default ‘’
Missing data representation.
- float_format : str, optional
Format string for floating point numbers. For example float_format="%.2f" will format 0.1234 to 0.12.
- columns : sequence or list of str, optional
Columns to write.
- header : bool or list of str, default True
Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.
- index : bool, default True
Write row names (index).
- index_label : str or sequence, optional
Column label for index column(s) if desired. If not specified, and header and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.
- startrow : int, default 0
Upper left cell row to dump data frame.
- startcol : int, default 0
Upper left cell column to dump data frame.
- engine : str, optional
Write engine to use, ‘openpyxl’ or ‘xlsxwriter’. You can also set this via the options io.excel.xlsx.writer or io.excel.xlsm.writer.
- merge_cells : bool, default True
Write MultiIndex and Hierarchical Rows as merged cells.
- inf_rep : str, default ‘inf’
Representation for infinity (there is no native representation for infinity in Excel).
- freeze_panes : tuple of int (length 2), optional
Specifies the one-based bottommost row and rightmost column that is to be frozen.
- storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://” and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details.
Added in version 1.2.0.
- engine_kwargs : dict, optional
Arbitrary keyword arguments passed to excel engine.
See Also
- to_csv : Write DataFrame to a comma-separated values (csv) file.
- ExcelWriter : Class for writing DataFrame objects into excel sheets.
- read_excel : Read an Excel file into a pandas DataFrame.
- read_csv : Read a comma-separated values (csv) file into DataFrame.
- io.formats.style.Styler.to_excel : Add styles to Excel sheet.
Notes
For compatibility with to_csv(), to_excel serializes lists and dicts to strings before writing.
Once a workbook has been saved it is not possible to write further data without rewriting the whole workbook.
Examples
Create, write to and save a workbook:
>>> df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
...                    index=['row 1', 'row 2'],
...                    columns=['col 1', 'col 2'])
>>> df1.to_excel("output.xlsx")
To specify the sheet name:
>>> df1.to_excel("output.xlsx",
...              sheet_name='Sheet_name_1')
If you wish to write to more than one sheet in the workbook, it is necessary to specify an ExcelWriter object:
>>> df2 = df1.copy()
>>> with pd.ExcelWriter('output.xlsx') as writer:
...     df1.to_excel(writer, sheet_name='Sheet_name_1')
...     df2.to_excel(writer, sheet_name='Sheet_name_2')
ExcelWriter can also be used to append to an existing Excel file:
>>> with pd.ExcelWriter('output.xlsx',
...                     mode='a') as writer:
...     df1.to_excel(writer, sheet_name='Sheet_name_3')
To set the library that is used to write the Excel file, you can pass the engine keyword (the default engine is automatically chosen depending on the file extension):
>>> df1.to_excel('output1.xlsx', engine='xlsxwriter')
- toClipboard(*, excel: bool = True, sep: str | None = None, **kwargs) None
Copy object to the system clipboard.
Write a text representation of object to the system clipboard. This can be pasted into Excel, for example.
Parameters
- excel : bool, default True
Produce output in a csv format for easy pasting into excel.
True, use the provided separator for csv pasting.
False, write a string representation of the object to the clipboard.
- sep : str, default '\t'
Field delimiter.
- **kwargs
These parameters will be passed to DataFrame.to_csv.
See Also
- DataFrame.to_csv : Write a DataFrame to a comma-separated values (csv) file.
- read_clipboard : Read text from clipboard and pass to read_csv.
Notes
Requirements for your platform:
- Linux : xclip, or xsel (with PyQt4 modules)
- Windows : none
- macOS : none
This method uses the processes developed for the package pyperclip. A solution to render any output string format is given in the examples.
Examples
Copy the contents of a DataFrame to the clipboard.
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])
>>> df.to_clipboard(sep=',')
... # Wrote the following to the system clipboard:
... # ,A,B,C
... # 0,1,2,3
... # 1,4,5,6
We can omit the index by passing the keyword index and setting it to False.
>>> df.to_clipboard(sep=',', index=False)
... # Wrote the following to the system clipboard:
... # A,B,C
... # 1,2,3
... # 4,5,6
Using the original pyperclip package for any string output format:
import pyperclip
html = df.style.to_html()
pyperclip.copy(html)