Norbert Orzechowicz
norbert-tech.bsky.social
Problem solver, software architect, also working as a Data Witcher ⚔️, hunting and killing 🧌 and 🪳 in data processing pipelines! Creator of https://flow-php.com
Which makes Flow capable of extracting data from full or partial HTML documents 💪

Our plan is to implement an extractor based on the Symfony/Panther library that would allow us to scrape data directly from the web in a single ETL pipeline.
November 12, 2025 at 11:26 AM
HTMLElementEntry can later be transformed through:

->domElementAttribute(ScalarFunction|string $attribute)
->domElementAttributesCount()
->domElementAttributeValue(ScalarFunction|string $attribute)
->domElementValue()
->domElementParent()
November 12, 2025 at 11:26 AM
But data types are nothing without functions!

- ref('response_body')->htmlQuerySelector('body div h1')
- ref('response_body')->htmlQuerySelectorAll('body p')

Both would return HTMLElementEntry.
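
To illustrate what selectors like the two above match, here is a minimal sketch that uses PHP 8.4's native Dom\HTMLDocument API rather than Flow itself. Whether Flow relies on this extension internally is an assumption, and the sample HTML and variable names are made up for the example. Reading the text and the class attribute is only a rough analogue of what the element functions might expose.

```php
<?php

declare(strict_types=1);

// Requires PHP 8.4+ (Dom\HTMLDocument is part of the dom extension).
// The HTML below is a made-up stand-in for a 'response_body' value.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<head><title>Example</title></head>
<body>
  <div><h1 class="title">Flow PHP</h1></div>
  <p>First paragraph</p>
  <p>Second paragraph</p>
</body>
</html>
HTML;

// LIBXML_NOERROR silences parser warnings on imperfect markup.
$document = \Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR);

// Single match: the first element matching the CSS selector, or null.
$heading = $document->querySelector('body div h1');
$headingText = $heading?->textContent;
$headingClass = $heading?->getAttribute('class');

// Multiple matches: every element matching the CSS selector.
$paragraphs = [];
foreach ($document->querySelectorAll('body p') as $paragraph) {
    $paragraphs[] = $paragraph->textContent;
}

echo $headingText, ' (class: ', $headingClass, ')', PHP_EOL;
echo implode(' | ', $paragraphs), PHP_EOL;
```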
November 12, 2025 at 11:26 AM
There are 2 new entry types in Flow DataFrame:

- entry_html
- entry_html_element

flow-php.com/documentati...
November 12, 2025 at 11:26 AM
Two new types were added to the flow-php/types library:

- type_html
- type_html_element

You can find them among all other types at: flow-php.com/documentati...
November 12, 2025 at 11:26 AM
7/8 But it was super fun to play with compiling PHP to WASM and seeing Flow code executed in the browser 🤩

If you would like to play with the initial version of Playground, look at our Discord server where I'm sharing progress and links to the working version.
November 3, 2025 at 7:30 AM
6/8 Maybe we could also allow it to accept user-uploaded datasets so anyone could play with their own dataset (but that would make sharing more tricky, so we need to plan this properly).
November 3, 2025 at 7:30 AM
5/8 Finally, I would also like to add some predefined datasets to the playground that would be accessible from the playground filesystem and also displayed in the files tree (like in a real IDE).
November 3, 2025 at 7:30 AM
4/8 Once this is done and once we have all of those functions/methods available in simple JSON, the next step will be to create an Ace Editor autocompleter that suggests only methods available in a given context.
November 3, 2025 at 7:30 AM
3/8 That's the next phase of this project. It will require two steps. The first one will analyze the Flow codebase and generate an autocompletion matrix directly from the Flow code, which will include:

- the whole DSL
- methods from the most critical API classes
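
As a rough idea of what generating such a matrix could look like, here is a minimal sketch based on PHP's Reflection API. It is not the Playground's actual generator: ExampleFrame, completion_matrix() and the JSON shape are all hypothetical and exist only for this illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for an API class; it is NOT part of Flow.
final class ExampleFrame
{
    public function limit(int $rows): self { return $this; }
    public function select(string ...$names): self { return $this; }
    private function internal(): void {}
}

/**
 * Builds a class => public methods map that can be dumped to JSON.
 *
 * @param class-string ...$classes
 * @return array<string, list<array{name: string, signature: string, returns: string}>>
 */
function completion_matrix(string ...$classes): array
{
    $matrix = [];

    foreach ($classes as $class) {
        $matrix[$class] = [];

        // Only public methods are useful as editor suggestions.
        foreach ((new ReflectionClass($class))->getMethods(ReflectionMethod::IS_PUBLIC) as $method) {
            $parameters = array_map(
                static fn (ReflectionParameter $p): string =>
                    ($p->hasType() ? (string) $p->getType() . ' ' : '')
                    . ($p->isVariadic() ? '...' : '')
                    . '$' . $p->getName(),
                $method->getParameters()
            );

            $matrix[$class][] = [
                'name' => $method->getName(),
                'signature' => $method->getName() . '(' . implode(', ', $parameters) . ')',
                'returns' => $method->hasReturnType() ? (string) $method->getReturnType() : 'mixed',
            ];
        }
    }

    return $matrix;
}

$matrix = completion_matrix(ExampleFrame::class);

echo json_encode($matrix, JSON_PRETTY_PRINT), PHP_EOL;
```

Keying the output by class name is one possible way to let an editor look up suggestions by the type of the expression before the cursor.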
November 3, 2025 at 7:30 AM
2/8 The solution is built on top of PHP compiled to WASM with Flow loaded as a phar.

It's not yet production-ready as it's missing one critical feature without which I can't use any code editors...

Autocompletion!
November 3, 2025 at 7:30 AM
Ha, great minds think alike 😅
If you run into any issues or questions, we have a Discord server (link on the website) where you can catch me 😁
October 27, 2025 at 6:06 PM
Both methods, batchSize and batchBy, are also covered in the new examples section of our documentation:

- flow-php.com/data_frame/...
- flow-php.com/data_frame/...
Flow PHP - Data frame - Batch size - Example
Code example showing batch size data frame.
flow-php.com
October 27, 2025 at 10:59 AM
The whole setup is available here: github.com/flow-php/fl...

Feedback, as always, is appreciated ❤️
flow/.github/workflows at 1.x · flow-php/flow
The most advanced data processing framework allowing to build scalable data processing pipelines and move data between various data sources and destinations. - flow-php/flow
github.com
October 2, 2025 at 10:14 AM