Let’s learn about Data Storage via these 95 free blog posts. They are ordered by HackerNoon reader engagement data. Visit the Learn Repo or LearnRepo.com to find the most read blog posts about any technology.
Data storage refers to the process and technologies used for preserving digital information. It matters for ensuring data persistence, availability, and integrity, which are critical for business operations, personal records, and scientific research in an increasingly data-driven world.
1. The Long Now of the Web: Inside the Internet Archive’s Fight Against Forgetting

A deep dive into the Internet Archive’s custom tech stack.
2. Storing and Retrieving Data in Next.js Using LocalStorage and TypeScript

Want to persist data in Next.js without a library or any third-party dependency, and keep it from changing or resetting after a refresh?
3. Why Are Removed Posts Still Visible on Reddit?

Even if moderators delete a post that is breaking the rules of Reddit, it is still very easy to find.
4. The Principles to Keep In Mind When Building a Modern Datalake for Your AI Infrastructure

The AI game is about performance at scale, and this requires the right foundation. Here’s how to be smart when building a modern datalake.
5. Managing Stateful Applications in Containerized Environments

Learn essential tips for handling stateful applications in container setups. Discover efficient strategies for seamless management in containerized environments
6. Hadoop Across Multiple Data Centers

Hadoop cluster across multiple data centers
7. Top 10 Javascript File Managers to Use in 2022

A brief intro to 10 file managers for software developers
8. An Architect’s Guide to the Top 10 Tools Needed to Build the Modern Data Lake

Here is a list of vendors and tools needed to build the modern data lake, with each entry a capability needed to support generative AI.
9. Managing Large Data Volumes With MinIO, Langchain and OpenAI

A practical guide to integrating MinIO, Langchain and OpenAI’s GPT-3.5 model focusing on summarizing documents stored in MinIO buckets.
10. How to Use MinIO as External Tables to Extend Snowflake

MinIO is a high-performance, cloud native object store. Because of this, MinIO can become the global datastore for Snowflake customers, wherever their data sits
11. Setting Up MinIO With Quickwit

MinIO is the right choice for Quickwit because of its industry-leading performance and scalability.
12. Unlocking the Power of Data Lakes for Embedded Analytics in Multi-Tenant SaaS

Discover why data lakes are superior to traditional data warehouses for embedded analytics in SaaS applications.
13. Everything You Need to Know to Deploy MinIO in Virtualized Environments

When deploying MinIO in virtualized environments, it’s important to make sure that the proper conditions are in place.
14. How Erasure Coding is Applied for Data Protection

Erasure coding is applied to data protection for distributed storage because it is resilient and efficient.
15. Protecting Software-defined Object Storage With MinIO’s Replication Best Practices

MinIO includes several ways to replicate data so you can choose the best methodology to meet your needs.
16. Scaling Ethereum: Data Bloat, Data Availability, and the Cloudless Solution

Determining how to persist Ethereum’s excess data will allow it to scale indefinitely into the future, and Codex has arrived to help.
17. A Beginner’s Guide to Understanding SQL Window Functions and Their Capabilities

Welcome to the world of SQL and Window functions! If you’re just starting out, you’re in the right place.
18. Make it Rain: How Repatriating Your Public Cloud Workload Can Save You Millions

A high performance, cloud-native object store offers you economic benefits, performance benefits, control benefits – and they compound with scale.
19. The State of Cloud Storage: #Decentralize-Cloud

The cloud storage market is 80% owned by 3 mega tech companies, and – if we want to keep our data safe – that has to change.
20. Top Players in the Hyperconverged Infrastructure Software Market for 2022

This blog examines the top players in the Hyperconverged Infrastructure software market for 2022, key trends in the market, and investigates their impact.
21. A Brief Introduction to Ethereum Swarm

The idea for Swarm came from Gavin Wood, one of the founders of Ethereum.
22. Silicon Valley’s Pied Piper is Now Real Thanks to New Compression Technology

HBO’s Silicon Valley imagined data compression with Pied Piper. Fast forward to 2024, and real-world startups like SQream Blue are making that dream a reality.
23. Migrate to AI-Ready Infrastructure: Hitachi Content Platform to MinIO

Developed to support customers’ evolving storage needs, the HCP-to-MinIO tool is freely available on GitHub and greatly simplifies the migration process.
24. Decentralized Cloud Storage is changing the face of the internet (1/2)

25. Sustainable Computing beyond the Cloud

Extreme increases in data streams are expanding the cloud’s carbon footprint; a sustainable alternative to Cloud dependence has been developed.
26. Efficient Data Storage for Rapid Analysis and Visualization

In this article, I want to share one of the ways that big data can be stored and used for analysis.
27. AR.IO Built ArNS So You Never Lose a Website — Ever

The web is filled with broken links and broken dreams… AR.IO has built ArNS to put an end to this misery
28. Digging into Postgres’s Lesser Known Features

Postgres Handles More than You Think
29. 10 Ways to Reduce Data Loss and Potential Downtime Of Your Database

In this article, you can find ten actionable methods to protect your mission-critical database.
30. Hot-Cold Data Separation: How It Cuts Your Storage Costs by 70%

Apparently hot-cold data separation is hot now. Let’s figure out why.
31. How to Increase Space in C Drive on Windows 10 Without Losing Data

If you receive a low disk space warning on your C: drive, you may use the Disk Cleanup utility to remove temporary and unwanted downloaded files.
32. How to Build a Database From Scratch: Understanding LSM Trees and Storage Engines (Part 1)

Learn core database concepts by implementing a Python key-value store with crash recovery and efficient writes.
33. Books vs. Servers: Where Should We Store All of the Human Knowledge?

We are used to storing our information in the cloud, which has clear benefits over paper databases and books. What is the best place to keep human knowledge?
34. A Platform-Agnostic Approach in Cloud Security for Data Engineers

Discover a platform-agnostic approach to cloud security for data engineers. Strengthen the defenses with encryption, zero-trust models, and multi-cloud tools.
35. Replacing Apache Hive, Elasticsearch and PostgreSQL with Apache Doris

Simplicity is the best policy.
36. An Analysis of Key Players in the Software Defined Storage (SDS) Market for 2022

This blog examines the top Software-Defined-Storage (SDS) market players for 2022 and highlights the importance of each one as an enterprise solution.
37. Decentralized Storage Networks — An Explainer

A comprehensive analysis of decentralized storage networks, technologies behind them, benefits, use cases, current issues, and an overview of DSN offerings
38. NFTs – Exploring Infrastructure, Usability, Role in DeFi, And Questions About Ownership

Walk through NFT Standard, NFT characteristic traits and explanations, NFT utilisation in DeFi.
39. Kafka Storage Design - Making File Systems Cool Again!

What makes Kafka so Fast? A Deep Dive into Kafka Storage Internals.
40. 5 Ways to Store Market Data: CSV, SQLite, Postgres, Mongo, Arctic

What’s the most efficient way to store market data? SQL or NoSQL? Let’s compare the 5 most common options and find out which is best.
41. “We plan to establish the first IPFS node on Mars” – says Filecoin Miner Neo Ge

As Filecoin gears up for launch, miners across the globe have been participating in Space Race, competing to onboard as much storage as possible to the testnet.
42. Data Lakehouses: The New Data Storage Model

Data lakehouses are quickly replacing old storage options like data lakes and warehouses. Read on for the history and benefits of data lakehouses.
43. The Best Options to Store Data and Keep it Safe Forever

As technology has evolved, the options for storing large amounts of data have also changed. This article discusses the terms archiving and backups.
44. Beyond Container Orchestration – Kublr’s Approach to Kubernetes Infrastructure Abstraction

[This post is inspired by an Interview with Kublr CTO, Oleg Chunikhin]
45. The Shortcomings of Computer-controlled Robots

Computer-controlled robots are monotonous. They are mostly able to perform a sequence of processing operations that is fixed by the equipment configuration and
46. How High-Quality Datasets Can Revolutionize Business Outcomes with Machine Learning

The accuracy of a machine learning model is a measure of how well it can make predictions on new, unseen data.
47. What Web3 & Decentralization Mean for Data Storage

I think of web3 and decentralization on a spectrum – it’s not just one or the other, but you can take incremental steps on the path towards your end goal.
48. Benefits of Corporate Data Backup and Best Practices to Keep in Place

Nowadays, companies are increasingly relying on corporate data backup solutions to guarantee the safety and recoverability of their data. Read on to learn more
49. Debugging Mobile App Database Issues & Optimizing Data Storage Performance

Learn the best techniques to debug mobile app database issues and optimize data storage performance for enhanced mobile app performance.
50. 8 Common Data Security Gaps in Health Care

Health care data security is crucial but can be challenging. Here are the most common data security gaps to address.
51. Building Serverless Notification Architecture Design Using AWS: A Step-by-Step Guide for Developers

Explore the technical advantages of AWS serverless architectures for notification systems. Learn about scalability, cost-efficiency, and security considerations.
52. MongoDB vs. DynamoDB: Choosing the Best Database for Your Business

All about MongoDB vs. DynamoDB. Explore the benefits and an in-depth comparison to find the best choice for your business app.
53. What is a Distributed Storage Network on a Blockchain System?

A Distributed Storage Network (DSN) is a peer-to-peer network based on blockchain. It is a decentralized and distributed network that provides storage.
54. 16 Guides to Get You Started with Apache Iceberg

These guides are designed to provide you with practical experience in working with Apache Iceberg.
55. How To Fix: The File is Too Large for the Destination File System Error

Have you ever encountered the error “the file is too large for the destination file system” while moving files from your Windows PC to a USB flash drive or external hard drive?
56. Storing data with Vinyl

This article describes how the developers of the in-memory computing platform Tarantool implemented disk storage.
57. Scaling PostgreSQL Databases Just Got Cheaper: Timescale’s Tiered Storage Hits General Availability

Timescale’s Tiered Storage, now in General Availability, introduces a multi-tiered storage architecture for PostgreSQL databases.
58. Compression in Big Data: Types and Techniques

This article will discuss compression in the Big Data context, covering the types and methods of compression
59. What Is the Use of a Linked List Class?

Whether you’re a beginner programmer or an experienced developer, understanding the linked list class is essential.
60. Enhancing Scalability with Off-Chain Data Storage in Blockchain Ecosystems

Exploring the role of off-chain data storage in blockchain technology, and how off-chain solutions can enhance scalability.
61. 9 Data Trends You’ll See in 2023

2022 saw the data space grow by leaps and bounds. Here are the top 9 things our team of data experts expects to see in 2023.
62. Decentralizing Data Storage on the Blockchain: An Exclusive Interview With Vincent Irlweck

Interview with Vincent Irlweck, CMO at Inery Blockchain to discuss their decentralized data storage solution and democratizing the data industry.
63. “We want functional decentralization” Q&A with Wildland Creators

Wildland is a new, open data management protocol with improved user privacy, security, and multi-categorization. A Q&A with J. Zawistowski and A. Regulski.
64. Distributed Data Store and Transaction Sagas

This is a tutorial on how to create a distributed data store by implementing leader-based replication.
65. How to handle your startup data like a big tech

Core principles in data management that all big tech companies adhere to can and should be adopted by startups.
66. Mounting Web3 Storage as a Folder on Your Desktop

Learn in this post how to mount Web3 Decentralized Storage as a folder on your desktop. Drag and Drop onto Web3 in seconds using Filebase and Mountain Duck.
67. The Benefits of Amazon S3 Explained Through a Comic

AWS S3 is one of the most fundamental services of AWS Cloud.
68. Accelerate Spark and Hive Jobs on AWS S3 by 10x with Alluxio as a Tiered Storage Solution

In this article, Thai Bui describes how Bazaarvoice leverages Alluxio as a caching tier on top of AWS S3 to maximize performance and minimize the operating costs of running Big Data analytics on AWS EC2. The original article can be found on Alluxio’s engineering blog.
69. Understanding Elasticsearch Reindexing: When to Reindex, Best Practices and Alternatives

Whether you’re a seasoned Elasticsearch user or just beginning your journey, understanding reindexing is important for maintaining an efficient cluster.
70. How Does Economic Recession Affect Colocation?

Some tout data storage as recession-proof, but with little precedent to learn from, what will a recession look like for colocation?
71. Optimizing Web Apps for High Traffic: Load Balancing, Caching, Database Performance

Web apps constantly need to process high volumes of data exchange effectively. In this article, we explore some tried-and-tested techniques for doing so.
72. Distributed Ledgers: The Next Logical Step

How modern blockchain approaches the problem of data storage in decentralized systems and how a distributed ledger can be organized.
73. 49 Stories To Learn About Data Storage

Learn everything you need to know about Data Storage via these 49 free HackerNoon stories.
74. Optimized Metadata Loading Process on ShardingSphere: A Technical Deep-Dive

The core functions of the powerful database middleware ShardingSphere, such as data sharding, encryption, and decryption, are all based on database metadata.
75. [How Velocity Can Change Data Storage And Propagation In Blockchain Nodes?](https://hackernoon.com/how-velocity-can-change-data-storage-and-propagation-in-blockchain-nodes-07x3z8b)

Blockchain technologies have been disruptive and have promoted the democratization of data, which has led to their wider usage and popularity. The blockchain paradigm has shifted from financial uses toward application development.
76. LLMs: How to Build AI Superintelligence? [Hint: Storage]

LLMs: Storage architectures from conceptual brain science could be the innovation channel for AI superintelligence, as well as energy efficient semiconductors.
77. All About Parquet Part 01 – An Introduction

Discover Apache Iceberg with a free guide, crash course, and video playlist. Learn efficient data management and processing for big data environments.
78. How to Protect your Business from Any Type of Criminals: The Complete Guide

Today’s world is undoubtedly not a safe haven for any business. Make no mistake: even running a small-time operation or setting up a niche venture can become challenging. Surprisingly, major brands like Sony or top cryptocurrency exchanges such as Binance aren’t under the greatest threat – criminals and hackers mostly look for prey at a much lower tier. Network security is a stumbling block for businesses of any scale nowadays. Digital-age fraudsters rarely attack the top corporations; the largest chunk of their bounty comes from companies that would never make it onto the Forbes list.
79. How To Restore Your Database From a SQL Backup

Backing up SQL and databases using the manual restore method, the full database backup, the incremental restore, or the manageauditing command.
80. What Is Redis and How Can It Make Your Website 30-40% Faster?

Redis is a type of database that can be used to significantly improve your website’s loading speed thanks to its design and its versatile selection of modules.
81. How to Manage Data Residency

After explaining the theory behind data residency, it’s time to get our hands dirty and implement it in a simple demo.
82. How to Perform a Cyber Security Risk Assessment: A Step-by-Step Guide

Companies are increasingly spending money on cyber security. However, attackers are launching more sophisticated cyber attacks that are hard to detect, and businesses often suffer severe consequences from them.
83. Win $10k AKT in the Filebase + Akash Hackathon

Join the Filebase + Akash Hackathon, running from 1 September to 30 October, and compete to win up to $10,000 AKT and up to $250 in free object storage.
84. How to Make Your Own and Free Backup Application

In our age of rapidly developing technologies, data loss can be a disaster not only for large corporations but also for the average user, which shows the immense importance of backup and data recovery in today’s data-driven world.
85. The Data Migration Tax: Paying Interest on the Myth of 3-2-1

This is the first article for the Data Migration Tax, which first appeared on my LinkedIn on September 22nd, 2025.
86. What are the Best Options to Store Data and Keep it Safe Forever?

This article will go over the most effective, economical and long-lasting methods for storing our data.
87. Making Sense of Unbounded Data & Real-Time Processing Systems

A real-time processing architecture should have these logical components to address event ingestion & processing challenges, such as a stream processing system.
88. The Evolution Of Hacking Data Storage [Infographic]

When the first computers were made, the information needed to run them was on punch cards. The computing device would decode the patterns on the punch cards and translate them into actions. It wasn’t until 1956 that IBM came up with the first magnetic hard drive, and floppy discs didn’t enter the scene until the 1960s. Early computer storage was rudimentary, which is why hackers posed no real threat in those early decades of computing. Once data storage became more sophisticated, hackers became a real threat. Subsequently, the need for cyber protection was born.
89. Data Storage Security: 5 Best Practices to Secure Your Data

Data is undoubtedly one of the most valuable assets of an organization. With easy-to-use and affordable options such as cloud-based storage environments, storing huge amounts of data in one place has become almost hassle-free. However, space is not the only concern for businesses any more.
90. How Does Cloud Storage Work?

Cloud storage is the technology that allows users and companies to store, maintain and access data on highly available servers via the internet.
91. Your “18-Month Migration Plan” Is a Fairy Tale

The myth of the 18-month migration plan collapses under real-world queueing, verification debt, and people constraints. Model like a factory or fail.
92. Transfer Big Data Across Cloud Platforms With Ease

With Big Data and the rise of more affordable object storage such as Wasabi Cloud Storage and Backblaze B2, the need to move large amounts of data away from the big three cloud providers is trending. But how do you do it? Often it is nearly impossible, or extremely hard, to break up with someone like Jeff Bezos and go somewhere with cheaper rates.
93. B2B Tech: What is New Enterprise and Why is Everybody Talking About It?

“New Enterprise” is an approach to business that is quickly gaining momentum in many sectors, especially tech. We explore what it is and why it matters.
94. How Can Businesses Overcome Common Challenges in Data Storage Performance?

Discover effective strategies to overcome common data storage performance challenges faced by businesses, ensuring efficient and optimized operations.
95. SeaTunnel Cluster Avoids Downtime With Advanced JVM and GC Tweaks

Check out how we optimized JVM configs in Apache SeaTunnel to eliminate Full GC, boost stability, and ensure smooth data syncs.
Thank you for checking out the 95 most read blog posts about Data Storage on HackerNoon.
Visit the Learn Repo to find the most read blog posts about any technology.