David Li
lidavidm.bsky.social
PMC member for Apache Arrow.
So I wonder if the term was already floating around the database community, and Julien (or someone else), perhaps unintentionally, used "shredding" in place of "striping" in the Parquet docs, and then the term took hold as Parquet became popular.
May 18, 2025 at 3:26 PM
The docs for SQL Server 2000 here talk about shredding:

> OPENXML calls can be used to provide rowset view...and process them, for example, inserting them into different tables (this process is also referred to as "Shredding XML into tables")

www.microsoft.com/en-nz/downlo...
Download SQL Server 2000 Retired Technical documentation from Official Microsoft Download Center
May 18, 2025 at 3:25 PM
This page, supposedly from 2003, talks about SQL Server 2000 adding a function to "shred" XML:
> Microsoft SQL Server 2000 also provides the OPENXML function to shred an XML document and provide a rowset representation of the XML data.
web.archive.org/web/20120115...
SQL Server 2000 and XML | SQL Server 2000 XML Support | InformIT
Discover the many different ways SQL Server 2000 supports XML with a comprehensive look at both out-of-the-box support and the SQLXML 3.0 add-on.
May 18, 2025 at 3:22 PM
It seems the first mention in the Parquet repos is from 2013, though. There Julien Le Dem links to a page about "striping" (as used in the Dremel paper) but calls it "shredding". So maybe you should ask him directly :)
github.com/apache/parqu...
Update README.md · apache/parquet-java@f7ba78a
May 18, 2025 at 3:18 PM
I went down a rabbit hole on "record shredding"... Here's something interesting: there's an SO question from 2008 asking about "shredding XML data into relational tables". Maybe the term sort of already existed? stackoverflow.com/questions/61...
The Best Way to shred XML data into SQL Server database columns
What is the best way to shred XML data into various database columns? So far I have mainly been using the nodes and value functions like so: INSERT INTO some_table (column1, column2, column3) SELECT
May 18, 2025 at 3:18 PM
April is a great time to visit :)
March 21, 2025 at 6:36 AM
I'm biased, but maybe things like Apache Parquet and Apache Arrow? They have multiple implementations across different languages, and Arrow gets used as a means of interchange between different data vendors (Spark, BigQuery, ClickHouse <-> pandas, Polars, etc.)
March 19, 2025 at 12:00 AM