Category Archives: Spark

Convert A Spark DataFrame with Date Columns to Pandas DataFrame using Apache Arrow

Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized, language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. After installing the pyarrow package, we can convert Spark DataFrame …

Posted in Python, Spark

LOBSTER on Spark

A new LOBSTER engine has been built on Spark. The demo Jupyter notebook is available here.

Posted in LOBSTER Data, Python, Spark