Sisyphe.core.sisypheSheet

External packages/modules

class Sisyphe.core.sisypheSheet.SisypheSheet(*args, **kwargs)

Description

Class derived from the pandas DataFrame class, adding I/O methods (xml, matfile) to it.

Inheritance

DataFrame -> SisypheSheet

Creation: 10/04/2023 Last revision: 13/12/2024
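Since SisypheSheet inherits from DataFrame, it can presumably be constructed and sliced like one. A minimal sketch of the DataFrame-subclassing pattern such a class relies on (MySheet is a hypothetical stand-in, not the PySisyphe implementation):

```python
import pandas as pd

# Hypothetical sketch of the DataFrame-subclass pattern SisypheSheet uses;
# the real class additionally provides xml/matfile IO methods.
class MySheet(pd.DataFrame):
    # Overriding _constructor makes pandas operations (slicing, copy, ...)
    # return the subclass instead of a plain DataFrame.
    @property
    def _constructor(self):
        return MySheet

sheet = MySheet({"col 1": [1.0, 2.0], "col 2": [3.0, 4.0]})
subset = sheet[["col 1"]]
print(type(subset).__name__)  # MySheet
```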

createXML(doc: Document, currentnode: Element) → None

Write the current SisypheSheet instance attributes to an xml document instance.

Parameters

doc : minidom.Document

xml document

currentnode : minidom.Element

xml root node
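The exact element layout createXML produces is internal to PySisyphe, but the calling pattern can be sketched with a plain DataFrame written into a minidom tree (the element and attribute names below are invented for illustration, not taken from the PySisyphe source):

```python
import pandas as pd
from xml.dom import minidom

# Hypothetical sketch of serializing tabular data into a minidom Document,
# the general pattern createXML/parseXML presumably follow.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

doc = minidom.Document()
root = doc.createElement("sheet")
doc.appendChild(root)
for name in df.columns:
    # one element per column, values stored as space-separated text
    col = doc.createElement("column")
    col.setAttribute("name", name)
    col.appendChild(doc.createTextNode(" ".join(str(v) for v in df[name])))
    root.appendChild(col)

print(doc.toprettyxml(indent="  "))
```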

classmethod getFileExt() → str

Get SisypheSheet file extension.

Returns

str

‘.xsheet’

classmethod getFilterExt() → str

Get SisypheSheet filter used by QFileDialog.getOpenFileName() and QFileDialog.getSaveFileName().

Returns

str

‘PySisyphe Sheet (.xsheet)’

classmethod load(filename: str) → SisypheSheet

Create a SisypheSheet instance from a PySisyphe Sheet file (.xsheet).

Parameters

filename : str

PySisyphe Sheet file name (.xsheet)

Returns

SisypheSheet

loaded sheet

classmethod loadCSV(filename: str) → SisypheSheet

Create a SisypheSheet instance from a CSV file (.csv).

Parameters

filename : str

CSV file name (.csv)

Returns

SisypheSheet

loaded sheet

classmethod loadJSON(filename: str) → SisypheSheet

Create a SisypheSheet instance from a JSON file (.json).

Parameters

filename : str

JSON file name (.json)

Returns

SisypheSheet

loaded sheet

classmethod loadTXT(filename: str, sep: str = '|') → SisypheSheet

Create a SisypheSheet instance from a text file (.txt).

Parameters

filename : str

text file name (.txt)

sep : str

optional (default ‘|’), char separator between values

Returns

SisypheSheet

loaded sheet
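Assuming loadTXT wraps the pandas CSV reader (an assumption, not confirmed by the PySisyphe source), the equivalent pandas call for a ‘|’-separated text file would be:

```python
import io
import pandas as pd

# Sketch of the pandas call loadTXT presumably wraps: read_csv with a
# '|' separator (io.StringIO stands in for a .txt file on disk).
text = "col 1|col 2\n1|3\n2|4\n"
df = pd.read_csv(io.StringIO(text), sep="|")
print(df.shape)  # (2, 2)
```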

classmethod loadXLSX(filename: str) → SisypheSheet

Create a SisypheSheet instance from an Excel file (.xlsx).

Parameters

filename : str

Excel file name (.xlsx)

Returns

SisypheSheet

loaded sheet

classmethod parseXML(doc: Document) → SisypheSheet

Create a SisypheSheet instance from xml document.

Parameters

doc : minidom.Document

xml document

Returns

SisypheSheet

new sheet

classmethod parseXMLNode(currentnode: Element) → SisypheSheet

Create a SisypheSheet instance from xml node.

Parameters

currentnode : minidom.Element

xml node

Returns

SisypheSheet

new sheet

save(filename: str) → None

Save the current SisypheSheet instance to a PySisyphe Sheet file (.xsheet).

Parameters

filename : str

PySisyphe Sheet file name (.xsheet)

saveCSV(filename: str) → None

Save the current SisypheSheet instance to a CSV file (.csv).

Parameters

filename : str

CSV file name (.csv)

saveJSON(path_or_buf: FilePath | WriteBuffer[bytes] | WriteBuffer[str] | None = None, *, orient: Literal['split', 'records', 'index', 'table', 'columns', 'values'] | None = None, date_format: str | None = None, double_precision: int = 10, force_ascii: bool_t = True, date_unit: TimeUnit = 'ms', default_handler: Callable[[Any], JSONSerializable] | None = None, lines: bool_t = False, compression: CompressionOptions = 'infer', index: bool_t | None = None, indent: int | None = None, storage_options: StorageOptions | None = None, mode: Literal['a', 'w'] = 'w') → str | None

Convert the object to a JSON string.

Note NaN’s and None will be converted to null and datetime objects will be converted to UNIX timestamps.

Parameters

path_or_buf : str, path object, file-like object, or None, default None

String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string.

orient : str

Indication of expected JSON string format.

  • Series:

    • default is ‘index’

    • allowed values are: {‘split’, ‘records’, ‘index’, ‘table’}.

  • DataFrame:

    • default is ‘columns’

    • allowed values are: {‘split’, ‘records’, ‘index’, ‘columns’, ‘values’, ‘table’}.

  • The format of the JSON string:

    • ‘split’ : dict like {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]}

    • ‘records’ : list like [{column -> value}, … , {column -> value}]

    • ‘index’ : dict like {index -> {column -> value}}

    • ‘columns’ : dict like {column -> {index -> value}}

    • ‘values’ : just the values array

    • ‘table’ : dict like {‘schema’: {schema}, ‘data’: {data}}

    Describing the data, where data component is like orient='records'.

date_format : {None, ‘epoch’, ‘iso’}

Type of date conversion. ‘epoch’ = epoch milliseconds, ‘iso’ = ISO8601. The default depends on the orient. For orient='table', the default is ‘iso’. For all other orients, the default is ‘epoch’.

double_precision : int, default 10

The number of decimal places to use when encoding floating point values. The possible maximal value is 15. Passing double_precision greater than 15 will raise a ValueError.

force_ascii : bool, default True

Force encoded string to be ASCII.

date_unit : str, default ‘ms’ (milliseconds)

The time unit to encode to, governs timestamp and ISO8601 precision. One of ‘s’, ‘ms’, ‘us’, ‘ns’ for second, millisecond, microsecond, and nanosecond respectively.

default_handler : callable, default None

Handler to call if object cannot otherwise be converted to a suitable format for JSON. Should receive a single argument which is the object to convert and return a serialisable object.

lines : bool, default False

If ‘orient’ is ‘records’ write out line-delimited json format. Will throw ValueError if incorrect ‘orient’ since others are not list-like.

compression : str or dict, default ‘infer’

For on-the-fly compression of the output data. If ‘infer’ and ‘path_or_buf’ is path-like, then detect compression from the following extensions: ‘.gz’, ‘.bz2’, ‘.zip’, ‘.xz’, ‘.zst’, ‘.tar’, ‘.tar.gz’, ‘.tar.xz’ or ‘.tar.bz2’ (otherwise no compression). Set to None for no compression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'xz', 'tar'} and other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdCompressor, lzma.LZMAFile or tarfile.TarFile, respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.

Added in version 1.5.0: Added support for .tar files.

Changed in version 1.4.0: Zstandard support.

index : bool or None, default None

The index is only used when ‘orient’ is ‘split’, ‘index’, ‘column’, or ‘table’. Of these, ‘index’ and ‘column’ do not support index=False.

indent : int, optional

Length of whitespace used to indent each record.

storage_options : dict, optional

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details and examples on storage options.

mode : str, default ‘w’ (writing)

Specify the IO mode for output when supplying a path_or_buf. Accepted args are ‘w’ (writing) and ‘a’ (append) only. mode='a' is only supported when lines is True and orient is ‘records’.

Returns

None or str

If path_or_buf is None, returns the resulting json format as a string. Otherwise returns None.

See Also

read_json : Convert a JSON string to pandas object.

Notes

The behavior of indent=0 varies from the stdlib, which does not indent the output but does insert newlines. Currently, indent=0 and the default indent=None are equivalent in pandas, though this may change in a future release.

orient='table' contains a ‘pandas_version’ field under ‘schema’. This stores the version of pandas used in the latest revision of the schema.

Examples

>>> from json import loads, dumps
>>> df = pd.DataFrame(
...     [["a", "b"], ["c", "d"]],
...     index=["row 1", "row 2"],
...     columns=["col 1", "col 2"],
... )
>>> result = df.to_json(orient="split")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
{
    "columns": [
        "col 1",
        "col 2"
    ],
    "index": [
        "row 1",
        "row 2"
    ],
    "data": [
        [
            "a",
            "b"
        ],
        [
            "c",
            "d"
        ]
    ]
}

Encoding/decoding a DataFrame using 'records' formatted JSON. Note that index labels are not preserved with this encoding.

>>> result = df.to_json(orient="records")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
[
    {
        "col 1": "a",
        "col 2": "b"
    },
    {
        "col 1": "c",
        "col 2": "d"
    }
]

Encoding/decoding a DataFrame using 'index' formatted JSON:

>>> result = df.to_json(orient="index")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
{
    "row 1": {
        "col 1": "a",
        "col 2": "b"
    },
    "row 2": {
        "col 1": "c",
        "col 2": "d"
    }
}

Encoding/decoding a DataFrame using 'columns' formatted JSON:

>>> result = df.to_json(orient="columns")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
{
    "col 1": {
        "row 1": "a",
        "row 2": "c"
    },
    "col 2": {
        "row 1": "b",
        "row 2": "d"
    }
}

Encoding/decoding a DataFrame using 'values' formatted JSON:

>>> result = df.to_json(orient="values")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
[
    [
        "a",
        "b"
    ],
    [
        "c",
        "d"
    ]
]

Encoding with Table Schema:

>>> result = df.to_json(orient="table")
>>> parsed = loads(result)
>>> dumps(parsed, indent=4)  
{
    "schema": {
        "fields": [
            {
                "name": "index",
                "type": "string"
            },
            {
                "name": "col 1",
                "type": "string"
            },
            {
                "name": "col 2",
                "type": "string"
            }
        ],
        "primaryKey": [
            "index"
        ],
        "pandas_version": "1.4.0"
    },
    "data": [
        {
            "index": "row 1",
            "col 1": "a",
            "col 2": "b"
        },
        {
            "index": "row 2",
            "col 1": "c",
            "col 2": "d"
        }
    ]
}
saveLATEX(buf: FilePath | WriteBuffer[str] | None = None, *, columns: Sequence[Hashable] | None = None, header: bool_t | SequenceNotStr[str] = True, index: bool_t = True, na_rep: str = 'NaN', formatters: FormattersType | None = None, float_format: FloatFormatType | None = None, sparsify: bool_t | None = None, index_names: bool_t = True, bold_rows: bool_t = False, column_format: str | None = None, longtable: bool_t | None = None, escape: bool_t | None = None, encoding: str | None = None, decimal: str = '.', multicolumn: bool_t | None = None, multicolumn_format: str | None = None, multirow: bool_t | None = None, caption: str | tuple[str, str] | None = None, label: str | None = None, position: str | None = None) → str | None

Render object to a LaTeX tabular, longtable, or nested table.

Requires \usepackage{booktabs}. The output can be copy/pasted into a main LaTeX document or read from an external file with \input{table.tex}.

Changed in version 2.0.0: Refactored to use the Styler implementation via jinja2 templating.

Parameters

buf : str, Path or StringIO-like, optional, default None

Buffer to write to. If None, the output is returned as a string.

columns : list of label, optional

The subset of columns to write. Writes all columns by default.

header : bool or list of str, default True

Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.

index : bool, default True

Write row names (index).

na_rep : str, default ‘NaN’

Missing data representation.

formatters : list of functions or dict of {str: function}, optional

Formatter functions to apply to columns’ elements by position or name. The result of each function must be a unicode string. List must be of length equal to the number of columns.

float_format : one-parameter function or str, optional, default None

Formatter for floating point numbers. For example float_format="%.2f" and float_format="{:0.2f}".format will both result in 0.1234 being formatted as 0.12.

sparsify : bool, optional

Set to False for a DataFrame with a hierarchical index to print every multiindex key at each row. By default, the value will be read from the config module.

index_names : bool, default True

Prints the names of the indexes.

bold_rows : bool, default False

Make the row labels bold in the output.

column_format : str, optional

The columns format as specified in LaTeX table format e.g. ‘rcl’ for 3 columns. By default, ‘l’ will be used for all columns except columns of numbers, which default to ‘r’.

longtable : bool, optional

Use a longtable environment instead of tabular. Requires adding a \usepackage{longtable} to your LaTeX preamble. By default, the value will be read from the pandas config module, and set to True if the option styler.latex.environment is “longtable”.

Changed in version 2.0.0: The pandas option affecting this argument has changed.

escape : bool, optional

By default, the value will be read from the pandas config module and set to True if the option styler.format.escape is “latex”. When set to False prevents from escaping latex special characters in column names.

Changed in version 2.0.0: The pandas option affecting this argument has changed, as has the default value to False.

encoding : str, optional

A string representing the encoding to use in the output file, defaults to ‘utf-8’.

decimal : str, default ‘.’

Character recognized as decimal separator, e.g. ‘,’ in Europe.

multicolumn : bool, default True

Use multicolumn to enhance MultiIndex columns. The default will be read from the config module, and is set as the option styler.sparse.columns.

Changed in version 2.0.0: The pandas option affecting this argument has changed.

multicolumn_format : str, default ‘r’

The alignment for multicolumns, similar to column_format. The default will be read from the config module, and is set as the option styler.latex.multicol_align.

Changed in version 2.0.0: The pandas option affecting this argument has changed, as has the default value to “r”.

multirow : bool, default True

Use multirow to enhance MultiIndex rows. Requires adding a \usepackage{multirow} to your LaTeX preamble. Will print centered labels (instead of top-aligned) across the contained rows, separating groups via clines. The default will be read from the pandas config module, and is set as the option styler.sparse.index.

Changed in version 2.0.0: The pandas option affecting this argument has changed, as has the default value to True.

caption : str or tuple, optional

Tuple (full_caption, short_caption), which results in \caption[short_caption]{full_caption}; if a single string is passed, no short caption will be set.

label : str, optional

The LaTeX label to be placed inside \label{} in the output. This is used with \ref{} in the main .tex file.

position : str, optional

The LaTeX positional argument for tables, to be placed after \begin{} in the output.

Returns

str or None

If buf is None, returns the result as a string. Otherwise returns None.

See Also

io.formats.style.Styler.to_latex : Render a DataFrame to LaTeX with conditional formatting.

DataFrame.to_string : Render a DataFrame to a console-friendly tabular output.

DataFrame.to_html : Render a DataFrame as an HTML table.

Notes

As of v2.0.0 this method has changed to use the Styler implementation as part of Styler.to_latex() via jinja2 templating. This means that jinja2 is a requirement, and needs to be installed, for this method to function. It is advised that users switch to using Styler, since that implementation is more frequently updated and contains much more flexibility with the output.

Examples

Convert a general DataFrame to LaTeX with formatting:

>>> df = pd.DataFrame(dict(name=['Raphael', 'Donatello'],
...                        age=[26, 45],
...                        height=[181.23, 177.65]))
>>> print(df.to_latex(index=False,
...                   formatters={"name": str.upper},
...                   float_format="{:.1f}".format,
... ))  
\begin{tabular}{lrr}
\toprule
name & age & height \\
\midrule
RAPHAEL & 26 & 181.2 \\
DONATELLO & 45 & 177.7 \\
\bottomrule
\end{tabular}
saveMATFILE(filename: str) → None

Save the current SisypheSheet instance to a Matlab file (.mat).

Parameters

filename : str

Matlab file name (.mat)

saveNPY(filename: str) → None

Save the current SisypheSheet instance to a Numpy binary file (.npy).

Parameters

filename : str

Numpy binary file name (.npy)
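Assuming saveNPY stores the sheet's values as a plain ndarray (an assumption; row and column labels would then not be preserved), the underlying NumPy roundtrip looks like:

```python
import os
import tempfile
import numpy as np
import pandas as pd

# Sketch of the numpy calls saveNPY presumably performs: the values go
# out as an ndarray, so index and column labels are lost on reload.
df = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})

path = os.path.join(tempfile.mkdtemp(), "sheet.npy")
np.save(path, df.to_numpy())
arr = np.load(path)
print(arr.shape)  # (2, 2)
```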

saveNPZ(filename: str) → None

Save the current SisypheSheet instance to a Numpy binary compressed file (.npz).

Parameters

filename : str

Numpy binary compressed file name (.npz)

saveTXT(filename: str, sep: str = '|') → None

Save the current SisypheSheet instance to a text file (.txt).

Parameters

filename : str

text file name (.txt)

sep : str

optional (default ‘|’), char separator between values
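Assuming saveTXT delegates to the pandas CSV writer (an assumption, not confirmed by the PySisyphe source), the equivalent pandas call for a ‘|’-separated text file would be:

```python
import io
import pandas as pd

# Sketch of the pandas call saveTXT presumably delegates to: to_csv with
# sep='|' (io.StringIO stands in for a .txt file on disk).
df = pd.DataFrame({"col 1": [1, 2], "col 2": [3, 4]})
buf = io.StringIO()
df.to_csv(buf, sep="|", index=False)
print(buf.getvalue())
```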

saveXLSX(excel_writer: FilePath | WriteExcelBuffer | ExcelWriter, *, sheet_name: str = 'Sheet1', na_rep: str = '', float_format: str | None = None, columns: Sequence[Hashable] | None = None, header: Sequence[Hashable] | bool_t = True, index: bool_t = True, index_label: IndexLabel | None = None, startrow: int = 0, startcol: int = 0, engine: Literal['openpyxl', 'xlsxwriter'] | None = None, merge_cells: bool_t = True, inf_rep: str = 'inf', freeze_panes: tuple[int, int] | None = None, storage_options: StorageOptions | None = None, engine_kwargs: dict[str, Any] | None = None) → None

Write object to an Excel sheet.

To write a single object to an Excel .xlsx file it is only necessary to specify a target file name. To write to multiple sheets it is necessary to create an ExcelWriter object with a target file name, and specify a sheet in the file to write to.

Multiple sheets may be written to by specifying unique sheet_name. With all data written to the file it is necessary to save the changes. Note that creating an ExcelWriter object with a file name that already exists will result in the contents of the existing file being erased.

Parameters

excel_writer : path-like, file-like, or ExcelWriter object

File path or existing ExcelWriter.

sheet_name : str, default ‘Sheet1’

Name of sheet which will contain DataFrame.

na_rep : str, default ‘’

Missing data representation.

float_format : str, optional

Format string for floating point numbers. For example float_format="%.2f" will format 0.1234 to 0.12.

columns : sequence or list of str, optional

Columns to write.

header : bool or list of str, default True

Write out the column names. If a list of string is given it is assumed to be aliases for the column names.

index : bool, default True

Write row names (index).

index_label : str or sequence, optional

Column label for index column(s) if desired. If not specified, and header and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.

startrow : int, default 0

Upper left cell row to dump data frame.

startcol : int, default 0

Upper left cell column to dump data frame.

engine : str, optional

Write engine to use, ‘openpyxl’ or ‘xlsxwriter’. You can also set this via the options io.excel.xlsx.writer or io.excel.xlsm.writer.

merge_cells : bool, default True

Write MultiIndex and Hierarchical Rows as merged cells.

inf_rep : str, default ‘inf’

Representation for infinity (there is no native representation for infinity in Excel).

freeze_panes : tuple of int (length 2), optional

Specifies the one-based bottommost row and rightmost column that is to be frozen.

storage_options : dict, optional

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://”, and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details and examples on storage options.

Added in version 1.2.0.

engine_kwargs : dict, optional

Arbitrary keyword arguments passed to excel engine.

See Also

to_csv : Write DataFrame to a comma-separated values (csv) file.

ExcelWriter : Class for writing DataFrame objects into excel sheets.

read_excel : Read an Excel file into a pandas DataFrame.

read_csv : Read a comma-separated values (csv) file into DataFrame.

io.formats.style.Styler.to_excel : Add styles to Excel sheet.

Notes

For compatibility with to_csv(), to_excel serializes lists and dicts to strings before writing.

Once a workbook has been saved it is not possible to write further data without rewriting the whole workbook.

Examples

Create, write to and save a workbook:

>>> df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],
...                    index=['row 1', 'row 2'],
...                    columns=['col 1', 'col 2'])
>>> df1.to_excel("output.xlsx")  

To specify the sheet name:

>>> df1.to_excel("output.xlsx",
...              sheet_name='Sheet_name_1')  

If you wish to write to more than one sheet in the workbook, it is necessary to specify an ExcelWriter object:

>>> df2 = df1.copy()
>>> with pd.ExcelWriter('output.xlsx') as writer:  
...     df1.to_excel(writer, sheet_name='Sheet_name_1')
...     df2.to_excel(writer, sheet_name='Sheet_name_2')

ExcelWriter can also be used to append to an existing Excel file:

>>> with pd.ExcelWriter('output.xlsx',
...                     mode='a') as writer:  
...     df1.to_excel(writer, sheet_name='Sheet_name_3')

To set the library that is used to write the Excel file, you can pass the engine keyword (the default engine is automatically chosen depending on the file extension):

>>> df1.to_excel('output1.xlsx', engine='xlsxwriter')  
toClipboard(*, excel: bool = True, sep: str | None = None, **kwargs) → None

Copy object to the system clipboard.

Write a text representation of object to the system clipboard. This can be pasted into Excel, for example.

Parameters

excel : bool, default True

Produce output in a csv format for easy pasting into excel.

  • True, use the provided separator for csv pasting.

  • False, write a string representation of the object to the clipboard.

sep : str, default '\t'

Field delimiter.

**kwargs

These parameters will be passed to DataFrame.to_csv.

See Also

DataFrame.to_csv : Write a DataFrame to a comma-separated values (csv) file.

read_clipboard : Read text from clipboard and pass to read_csv.

Notes

Requirements for your platform.

  • Linux : xclip, or xsel (with PyQt4 modules)

  • Windows : none

  • macOS : none

This method uses the processes developed for the package pyperclip. A solution to render any output string format is given in the examples.

Examples

Copy the contents of a DataFrame to the clipboard.

>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])
>>> df.to_clipboard(sep=',')  
... # Wrote the following to the system clipboard:
... # ,A,B,C
... # 0,1,2,3
... # 1,4,5,6

We can omit the index by passing the keyword index and setting it to false.

>>> df.to_clipboard(sep=',', index=False)  
... # Wrote the following to the system clipboard:
... # A,B,C
... # 1,2,3
... # 4,5,6

Using the original pyperclip package for any string output format:

>>> import pyperclip
>>> html = df.style.to_html()
>>> pyperclip.copy(html)