glide.extensions.pandas module
-
class glide.extensions.pandas.DataFrameApplyMap(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node
Apply a transform to a Pandas DataFrame.
-
class glide.extensions.pandas.DataFrameBollingerBands(name, _log=False, _debug=False, **default_context)
Bases: glide.extensions.pandas.DataFrameRollingNode
Compute Bollinger Bands for the specified columns in a DataFrame.
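For orientation, Bollinger Bands are conventionally a rolling mean plus and minus a multiple of the rolling standard deviation. The sketch below uses plain pandas and does not reproduce this node's implementation; the `bollinger_bands` helper, the `num_std` multiplier of 2, and the output column names are assumptions made for illustration.

```python
import pandas as pd


def bollinger_bands(series, window, num_std=2.0):
    """Hypothetical helper: return lower/middle/upper bands for one column."""
    rolling = series.rolling(window)
    middle = rolling.mean()
    # pandas uses the sample standard deviation (ddof=1) by default
    std = rolling.std()
    return pd.DataFrame(
        {
            "lower": middle - num_std * std,
            "middle": middle,
            "upper": middle + num_std * std,
        }
    )


prices = pd.Series([10.0, 12.0, 14.0, 16.0])
bands = bollinger_bands(prices, window=2)
```

The first `window - 1` rows are NaN because the window is not yet full.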
-
class glide.extensions.pandas.DataFrameCSVExtract(name, _log=False, _debug=False, **default_context)
Bases: glide.extensions.pandas.DataFramePush
Extract data from a CSV using Pandas.
-
class glide.extensions.pandas.DataFrameCSVLoad(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node
Load data into a CSV from a Pandas DataFrame.
-
run(df, f, push_file=False, dry_run=False, **kwargs)
Use Pandas to_csv to output a DataFrame.
- Parameters
df (pandas.DataFrame) – DataFrame to load to a CSV
f (file or buffer) – File to write the DataFrame to
push_file (bool, optional) – If true, push the file forward instead of the data
dry_run (bool, optional) – If true, skip actually loading the data
**kwargs – Keyword arguments passed to DataFrame.to_csv
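The parameters above can be read as follows. This is a minimal sketch in plain pandas, assuming only the behavior stated in the parameter descriptions; `load_csv` is a hypothetical stand-in and not the node's actual code.

```python
import io

import pandas as pd


def load_csv(df, f, push_file=False, dry_run=False, **kwargs):
    """Hypothetical stand-in for the documented run() behavior."""
    if not dry_run:
        # Extra keyword arguments go straight to DataFrame.to_csv
        df.to_csv(f, **kwargs)
    # Return what would be pushed downstream: the file or the data
    return f if push_file else df


df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

buffer = io.StringIO()
pushed = load_csv(df, buffer, index=False)
csv_lines = buffer.getvalue().splitlines()

# With dry_run=True nothing is written to the target
dry_buffer = io.StringIO()
load_csv(df, dry_buffer, dry_run=True, index=False)
```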
-
-
class glide.extensions.pandas.DataFrameExcelExtract(name, _log=False, _debug=False, **default_context)
Bases: glide.extensions.pandas.DataFramePush
Extract data from an Excel file using Pandas.
-
run(f, **kwargs)
Extract data from the input file and push it as a DataFrame. When multiple sheets are read from the Excel file, a dict of DataFrames is pushed instead.
- Parameters
f – file or buffer to be passed to pandas.read_excel
**kwargs – kwargs to be passed to pandas.read_excel
-
-
class glide.extensions.pandas.DataFrameExcelLoad(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node
Load data into an Excel file from a Pandas DataFrame.
-
run(df_or_dict, f, push_file=False, dry_run=False, **kwargs)
Use Pandas to_excel to output a DataFrame.
- Parameters
df_or_dict – DataFrame or dict of DataFrames to load to an Excel file. In the case of a dict the keys will be the sheet names.
f (file or buffer) – File to write the DataFrame to
push_file (bool, optional) – If true, push the file forward instead of the data
dry_run (bool, optional) – If true, skip actually loading the data
**kwargs – Keyword arguments passed to DataFrame.to_excel
-
-
class glide.extensions.pandas.DataFrameHTMLExtract(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node
Extract data from HTML tables using Pandas.
-
class glide.extensions.pandas.DataFrameHTMLLoad(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node
run(df, f, push_file=False, dry_run=False, **kwargs)
Use Pandas to_html to output a DataFrame.
- Parameters
df (pandas.DataFrame) – DataFrame to load to an HTML file
f (file or buffer) – File to write the DataFrame to
push_file (bool, optional) – If true, push the file forward instead of the data
dry_run (bool, optional) – If true, skip actually loading the data
**kwargs – Keyword arguments passed to DataFrame.to_html
-
-
class glide.extensions.pandas.DataFrameMethod(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node
Helper to execute any pandas DataFrame method.
-
class glide.extensions.pandas.DataFrameMovingAverage(name, _log=False, _debug=False, **default_context)
Bases: glide.extensions.pandas.DataFrameRollingNode
Compute a moving average on a DataFrame.
-
class glide.extensions.pandas.DataFramePush(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node, glide.extensions.pandas.DataFramePushMixin
Base class for DataFrame-based nodes.
-
class glide.extensions.pandas.DataFramePushMixin
Bases: object
Shared logic for DataFrame-based nodes.
-
do_push(df, chunksize=None)
Push the DataFrame to the next node, obeying chunksize if passed.
- Parameters
df (pandas.DataFrame or iterable of DataFrames) – DataFrame to push. If the chunksize argument is passed and truthy, this is expected to be an iterable of DataFrame chunks instead.
chunksize (int, optional) – If truthy, df is treated as chunks of a DataFrame, and each chunk is pushed individually.
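The chunking behavior described above can be sketched as below. This is an illustration under stated assumptions, not the mixin's code: `do_push_sketch` and its `push` callback are hypothetical, and the chunks come from `pandas.read_csv` called with `chunksize`, which returns an iterator of DataFrames.

```python
import io

import pandas as pd


def do_push_sketch(push, df, chunksize=None):
    """Hypothetical stand-in: push df whole, or chunk by chunk."""
    if chunksize:
        # df is an iterable of DataFrame chunks
        for chunk in df:
            push(chunk)
    else:
        push(df)


received = []  # stands in for the downstream node

csv_data = io.StringIO("x\n1\n2\n3\n4\n5\n")
chunks = pd.read_csv(csv_data, chunksize=2)
do_push_sketch(received.append, chunks, chunksize=2)

chunk_lengths = [len(chunk) for chunk in received]
```

Five rows read with a chunk size of 2 arrive downstream as three separate pushes.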
-
-
class glide.extensions.pandas.DataFrameRollingNode(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node
Apply df.rolling to a DataFrame.
-
compute_stats(df, rolling, column_name)
Override this to implement logic to manipulate the DataFrame.
-
run(df, windows, columns=None, suffix=None, **kwargs)
Use df.rolling to apply a rolling window calculation on a DataFrame. See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html
- Parameters
df (pandas.DataFrame) – The pandas DataFrame to process
windows (int or list of ints) – Size(s) of the moving window(s). If a list, all windows will be calculated and the window size will be appended as a suffix.
columns (list, optional) – A list of columns to calculate values for
suffix (str, optional) – A suffix to add to the column names of calculated values
**kwargs – Keyword arguments passed to df.rolling
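A rough sketch of how these parameters might combine, using a rolling mean as the statistic. The `rolling_mean_sketch` function, the default `"-mean"` suffix, and the exact output column naming are assumptions for illustration; the node's real naming scheme may differ.

```python
import pandas as pd


def rolling_mean_sketch(df, windows, columns=None, suffix=None, **kwargs):
    """Hypothetical stand-in: add rolling-mean columns for each window."""
    multiple = not isinstance(windows, int)
    window_list = list(windows) if multiple else [windows]
    columns = columns or list(df.columns)
    suffix = suffix or "-mean"  # assumed default, not taken from glide

    out = df.copy()
    for window in window_list:
        for col in columns:
            # Append the window size only when several windows are requested
            name = f"{col}{suffix}-{window}" if multiple else f"{col}{suffix}"
            # Extra keyword arguments go straight to rolling()
            out[name] = df[col].rolling(window, **kwargs).mean()
    return out


df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0]})
result = rolling_mean_sketch(df, [2, 3], columns=["price"])
```

Keyword arguments such as `min_periods` are forwarded unchanged, so `rolling_mean_sketch(df, 2, min_periods=1)` fills the first row instead of leaving it NaN.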
-
-
class glide.extensions.pandas.DataFrameRollingStd(name, _log=False, _debug=False, **default_context)
Bases: glide.extensions.pandas.DataFrameRollingNode
Compute a rolling standard deviation on a DataFrame.
-
class glide.extensions.pandas.DataFrameRollingSum(name, _log=False, _debug=False, **default_context)
Bases: glide.extensions.pandas.DataFrameRollingNode
Compute a rolling window sum on a DataFrame.
-
class glide.extensions.pandas.DataFrameSQLExtract(*args, **kwargs)
Bases: glide.extensions.pandas.PandasSQLNode
Extract data from a SQL db using Pandas.
-
class glide.extensions.pandas.DataFrameSQLLoad(*args, **kwargs)
Bases: glide.extensions.pandas.PandasSQLNode
Load data into a SQL db from a Pandas DataFrame.
-
run(df, conn, table, push_table=False, dry_run=False, **kwargs)
Use Pandas to_sql to output a DataFrame.
- Parameters
df (pandas.DataFrame) – DataFrame to load to a SQL table
conn – Database connection
table (str) – Name of a table to write the data to
push_table (bool, optional) – If true, push the table forward instead of the data
dry_run (bool, optional) – If true, skip actually loading the data
**kwargs – Keyword arguments passed to DataFrame.to_sql
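The underlying pandas call can be sketched with an in-memory SQLite database. This shows only `DataFrame.to_sql` itself, not the node's code; the `scores` table and its columns are made up for the example.

```python
import sqlite3

import pandas as pd

# sqlite3.Connection is one of the connection types pandas accepts
conn = sqlite3.connect(":memory:")

df = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 0.7, 0.9]})

# index and if_exists are examples of keyword arguments that reach to_sql
df.to_sql("scores", conn, index=False, if_exists="replace")

row_count = conn.execute("SELECT COUNT(*) FROM scores").fetchone()[0]
```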
-
-
class glide.extensions.pandas.DataFrameSQLTableExtract(*args, **kwargs)
Bases: glide.extensions.pandas.PandasSQLNode
Extract data from a SQL table using Pandas.
-
run(table, conn, where=None, limit=None, **kwargs)
Extract data from the input table and push it as a DataFrame.
- Parameters
table (str) – SQL table to query
conn – A SQL database connection
where (str, optional) – A SQL where clause
limit (int, optional) – Limit to put in SQL limit clause
**kwargs – kwargs to be passed to pandas.read_sql
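One way the table, where, and limit arguments could be combined into a query is sketched below. The `build_table_query` helper is hypothetical, and it assumes that `where` is passed as a full clause including the `WHERE` keyword, which the parameter description does not confirm.

```python
import sqlite3

import pandas as pd


def build_table_query(table, where=None, limit=None):
    """Hypothetical helper: assemble a SELECT from the documented arguments."""
    # The table name is interpolated directly, so pass trusted names only
    sql = f"SELECT * FROM {table}"
    if where:
        sql += f" {where}"  # assumed to include the WHERE keyword
    if limit:
        sql += f" LIMIT {int(limit)}"
    return sql


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "click"), (2, "view"), (3, "click"), (4, "click")],
)

query = build_table_query("events", where="WHERE kind = 'click'", limit=2)
df = pd.read_sql(query, conn)
```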
-
-
class glide.extensions.pandas.DataFrameSQLTempLoad(*args, **kwargs)
Bases: glide.extensions.pandas.PandasSQLNode
Load data into a SQL temp table from a Pandas DataFrame.
-
run(df, conn, schema=None, dry_run=False, **kwargs)
Use Pandas to_sql to output a DataFrame to a temporary table. Push a reference to the temp table forward.
- Parameters
df (pandas.DataFrame) – DataFrame to load to a SQL table
conn – Database connection
schema (str, optional) – schema to create the temp table in
dry_run (bool, optional) – If true, skip actually loading the data
**kwargs – Keyword arguments passed to DataFrame.to_sql
-
-
class glide.extensions.pandas.FromDataFrame(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node
-
class glide.extensions.pandas.PandasSQLNode(*args, **kwargs)
Bases: glide.sql.BaseSQLNode, glide.extensions.pandas.DataFramePushMixin
Captures the connection types allowed to work with Pandas to_sql/read_sql.
-
allowed_conn_types = [<class 'sqlalchemy.engine.base.Connection'>, <class 'sqlalchemy.engine.interfaces.Connectable'>, <class 'sqlite3.Connection'>]
-
-
class glide.extensions.pandas.ToDataFrame(name, _log=False, _debug=False, **default_context)
Bases: glide.core.Node