#partitioning


#time-series-storage #schema-normalization #indexing #downsampling #linux #data-migration #backups #spark #scala #rdd

Time-Series Storage: Design Choices That Shape Cost and Performance

In the article's experiment, normalizing series identity into a metadata table and referencing it by compact IDs reduced time-series storage by about 42%.

Software development

How to switch Linux distros and retain all of your data

Installing Linux with a separate home partition allows switching distributions without losing personal data.

Spark Scala Exercise 22: Custom Partitioning in Spark RDDs - Load Balancing and Shuffle

Custom partitioners in Spark Scala give direct control over how RDD data is distributed across partitions.
