Metadata-Version: 2.1
Name: sparkdatachallenge
Version: 0.1.1
Summary: Find multiplicative pairs in a sorted array of decimal numbers built from integer and fractional parts.
Home-page: https://github.com/tomerten/sparkdatachallenge
License: MIT
Keywords: packaging,poetry
Author: Tom Mertens
Author-email: your.email@whatev.er
Requires-Python: >=3.8,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: ipykernel (>=5.5.5,<6.0.0)
Requires-Dist: isort (>=5.8.0,<6.0.0)
Requires-Dist: matplotlib (>=3.4.2,<4.0.0)
Requires-Dist: nbsphinx (>=0.8.5,<0.9.0)
Requires-Dist: numpy (>=1.20.3,<2.0.0)
Requires-Dist: pandas (>=1.2.4,<2.0.0)
Requires-Dist: pylint (>=2.8.2,<3.0.0)
Requires-Dist: sympy (>=1.8,<2.0)
Project-URL: Repository, https://github.com/tomerten/sparkdatachallenge
Description-Content-Type: text/x-rst

==================
sparkdatachallenge
==================



Spark data challenge: find multiplicative pairs in a sorted array of decimal numbers.
The decimal numbers are built from two integer arrays (A, B): A holds the integer parts
and B holds the fractional parts, stored as integers.

The decimal numbers are constructed as follows::

    C[i] = A[i] + B[i] / scale

where the scale is a fixed number (here 1_000_000).
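
As an illustration of the construction above, here is a minimal NumPy sketch. The function name ``build_decimals`` and the constant ``SCALE`` are hypothetical and are not taken from the package's API; the sketch also assumes that every ``B[i]`` lies in the range ``0 <= B[i] < scale``.

```python
import numpy as np

SCALE = 1_000_000  # fixed scale from the description above


def build_decimals(a, b, scale=SCALE):
    """Return C with C[i] = A[i] + B[i] / scale (hypothetical helper)."""
    a = np.asarray(a, dtype=np.int64)  # integer parts
    b = np.asarray(b, dtype=np.int64)  # fractional parts stored as integers
    return a + b / scale               # element-wise, result is float64


# Example: A = [0, 1, 2], B = [500000, 500000, 0] gives C = [0.5, 1.5, 2.0]
c = build_decimals([0, 1, 2], [500_000, 500_000, 0])
```

Because floating-point values are only approximations, comparisons are usually safer on the scaled integers ``A[i] * scale + B[i]`` than on ``C`` itself.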

* Free software: MIT license
* Documentation: https://sparkdatachallenge.readthedocs.io.


Features
--------

* Brute-force method that uses only NumPy vectorized functions; it fails for large arrays because of memory allocation.
* Brute-force method based on a double for-loop.
* Math-based method, optimized using mathematical properties of the inequalities and the fact that the decimal number array C is sorted.
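
The difference between the double for-loop and a method that exploits the sorted order can be sketched as follows. This is not the package's implementation. It assumes that a pair ``(i, j)`` with ``i < j`` counts as multiplicative when ``C[i] * C[j] >= C[i] + C[j]`` (the exact inequality is not stated in this description), that all values are non-negative, and that ``C`` is sorted in ascending order. The function names are hypothetical. Both functions work on the scaled integers ``X[i] = A[i] * scale + B[i]``, for which the assumed condition becomes ``X[i] * X[j] >= scale * (X[i] + X[j])``.

```python
SCALE = 1_000_000


def count_pairs_loop(a, b, scale=SCALE):
    """Double for-loop over all pairs i < j: O(n^2) time (hypothetical helper)."""
    x = [ai * scale + bi for ai, bi in zip(a, b)]  # exact scaled integers
    count = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            # Assumed condition, written without floating point.
            if x[i] * x[j] >= scale * (x[i] + x[j]):
                count += 1
    return count


def count_pairs_sorted(a, b, scale=SCALE):
    """Two-pointer scan for sorted, non-negative input: O(n) time."""
    x = [ai * scale + bi for ai, bi in zip(a, b)]
    # Zeros only pair with other zeros (0 * 0 >= 0 + 0), so count them apart.
    zeros = sum(1 for v in x if v == 0)
    count = zeros * (zeros - 1) // 2
    lo, hi = zeros, len(x) - 1
    while lo < hi:
        if x[lo] * x[hi] >= scale * (x[lo] + x[hi]):
            # Every element between lo and hi also pairs with x[hi].
            count += hi - lo
            hi -= 1
        else:
            # x[lo] cannot pair with any remaining element.
            lo += 1
    return count


A = [0, 1, 2, 2, 3, 5]
B = [500_000, 500_000, 0, 0, 0, 20_000]
print(count_pairs_loop(A, B), count_pairs_sorted(A, B))  # prints "8 8"
```

The two-pointer scan relies on the rewrite ``(C[i] - 1) * (C[j] - 1) >= 1`` of the assumed condition, which is monotone in both factors once both values exceed 1.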
  

