Line 2: | Line 2: | ||
<translate>
<!--T:1-->
[https://arrow.apache.org/ Apache Arrow] is a cross-language development platform for in-memory data. It uses a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust.
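As a rough illustration of what a columnar layout means, the following pure-Python sketch pivots row-oriented records into one list per field. It does not use Arrow, and the names <tt>rows</tt> and <tt>to_columns</tt> are made up for this example. Arrow's actual format stores each column in contiguous typed buffers, which plain Python lists only approximate.

```python
# Toy illustration only: Arrow itself is not used here.
def to_columns(rows):
    """Pivot a list of dict records into a dict of per-field lists.

    Assumes every record carries the same field names.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Hypothetical sample records, stored row by row.
rows = [
    {"id": 1, "temp": 20.5},
    {"id": 2, "temp": 21.0},
    {"id": 3, "temp": 19.8},
]

columns = to_columns(rows)

# An analytic operation such as a mean now scans a single list
# instead of touching every record.
mean_temp = sum(columns["temp"]) / len(columns["temp"])
print(columns["id"], round(mean_temp, 2))
```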
== CUDA == <!--T:2-->
Line 9: | Line 9: | ||
== Python bindings == <!--T:3-->
The module contains bindings for multiple Python versions.
To find out which Python versions are compatible, run
{{Command|module spider arrow/0.16.0}}
=== PyArrow === <!--T:4-->
The Arrow Python bindings (also named ''PyArrow'') have first-class integration with NumPy, Pandas, and built-in Python objects. They are based on the C++ implementation of Arrow.
<!--T:5-->
1. Load the required modules.
{{Command|module load gcc/8.3.0 arrow/0.16.0 python/3.7 scipy-stack}}
<!--T:6-->
2. Import PyArrow.
{{Command|python -c "import pyarrow"}}
<!--T:7-->
If the command displays nothing, the import was successful.
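The same check can be done from a script that also reports which version was picked up. This is a minimal sketch, not part of the official instructions: the helper name <tt>report_module</tt> is hypothetical, and it relies only on the standard library plus the <tt>__version__</tt> attribute that most packages (PyArrow included) expose.

```python
import importlib


def report_module(name):
    """Return the module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Fall back to "unknown" for modules without a __version__ attribute.
    return getattr(module, "__version__", "unknown")


version = report_module("pyarrow")
if version is None:
    print("pyarrow could not be imported; check that the arrow and python modules are loaded")
else:
    print("pyarrow", version)
```

Returning <tt>None</tt> instead of raising keeps the script usable at the top of a job, where a clear message is more helpful than a traceback.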
<!--T:8-->
For more information, see the [https://arrow.apache.org/docs/python/ Arrow Python] documentation.
==== Apache Parquet format ==== <!--T:9-->
The [http://parquet.apache.org/ Parquet] file format is available.
<!--T:10-->
To import the Parquet module, execute the previous steps for <tt>pyarrow</tt>, then run
{{Command|python -c "import pyarrow.parquet"}}
<!--T:11-->
If the command displays nothing, the import was successful.
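Once the import works, a small round trip is a quick way to confirm that Parquet files can actually be written and read. The sketch below is illustrative only: the helper <tt>parquet_roundtrip</tt>, the sample data and the file name <tt>example.parquet</tt> are assumptions made for this example, and the function returns <tt>None</tt> when PyArrow is not available.

```python
import os
import tempfile


def parquet_roundtrip(data):
    """Write `data` to a temporary Parquet file and read it back as a dict.

    Returns None when pyarrow cannot be imported.
    """
    try:
        import pyarrow as pa
        import pyarrow.parquet as pq
    except ImportError:
        return None

    # Build an in-memory Arrow table from plain Python lists.
    table = pa.Table.from_pydict(data)

    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "example.parquet")
        pq.write_table(table, path)
        # Read the file back and convert to Python objects
        # before the temporary directory is removed.
        return pq.read_table(path).to_pydict()


# Hypothetical sample data: one list per column.
sample = {"id": [1, 2, 3], "name": ["a", "b", "c"]}
result = parquet_roundtrip(sample)
print(result if result is not None else "pyarrow is not available")
```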
== R bindings == <!--T:12-->
The Arrow package exposes an interface to the Arrow C++ library to access many of its features in R. This includes support for analyzing large, multi-file datasets ([https://arrow.apache.org/docs/r/reference/open_dataset.html open_dataset()]), working with individual Parquet files ([https://arrow.apache.org/docs/r/reference/read_parquet.html read_parquet()], [https://arrow.apache.org/docs/r/reference/write_parquet.html write_parquet()]) and Feather files ([https://arrow.apache.org/docs/r/reference/read_feather.html read_feather()], [https://arrow.apache.org/docs/r/reference/write_feather.html write_feather()]), as well as lower-level access to the Arrow memory and messages.
=== Installation === <!--T:13-->
1. Load the required modules.
{{Command|module load gcc/8.3.0 arrow/0.16.0 r/3.6 boost/1.68.0}}
<!--T:14-->
2. Specify the local installation directory.
{{Commands
|mkdir -p ~/.local/R/$EBVERSIONR/
Line 55: | Line 55: | ||
<!--T:15-->
3. Export the required variables to ensure you are using the system installation.
{{Commands
|export PKG_CONFIG_PATH{{=}}$EBROOTARROW/lib/pkgconfig
Line 63: | Line 63: | ||
<!--T:16-->
4. Install the bindings.
{{Command|R -e 'install.packages("arrow", repos{{=}}"https://cloud.r-project.org/")'}}
=== Usage === <!--T:17-->
After the bindings are installed, they have to be loaded.
<!--T:18-->
1. Load the required modules.
{{Command|module load gcc/8.3.0 arrow/0.16.0 r/3.6}}
<!--T:19-->
2. Load the library.
{{Command
|R -e "library(arrow)"
Line 83: | Line 83: | ||
<!--T:20-->
For more information, see the [https://arrow.apache.org/docs/r/index.html Arrow R documentation].
</translate>