Make it work, then make it fast
Atlantbh had the opportunity to implement a back-end software solution for an insurance company. The client required a data processing engine able to transform data ranging in size from a few kilobytes to dozens of gigabytes, using the same mechanism. Having a…
August 10, 2022
Data hunters – How Big Data changed the world (of golf)
Ever-increasing amounts of big data in health care, data gathered through mobile applications, fitness data… It is considered an advantage that these data are automatically collected and analyzed using all sorts of analytical tools, machine learning, AI technologies, or data mining. Still, the human factor is the core of how…
December 1, 2020
Amazon Elastic MapReduce web service
Introduction Hadoop as a platform enables us to store and process vast amounts of data. Storage capacity and processing power are directly related to the size of our Hadoop cluster, and scaling it up is as simple as adding new nodes to the cluster. (more…)
January 6, 2013
HBase – Row level timestamp consistency in HBase
HBase supports multiple dimensions of data by keeping multiple versions of each table cell. Each cell is therefore identified by three keys: row key, column, and version. (more…)
January 4, 2013
Hadoop – Identifying and skipping processed data
A paper on identifying and skipping processed data – an effort to minimize wasted cloud resources in Hadoop when processing data from HDFS. (more…)