Lilypad Project Report: August 7
Team Offsite, Project Roadmap Detail, Filecoin Data Prep Module
Image ❯ lilypad run sdxl:v0.9-lilypad1 "technical project roadmap, photorealistic, fantasy, sci-fi, star trek"
Result: https://ipfs.io/ipfs/QmbvE7r6XmQRvBHmATCTQTqBZTnSAkEdNvTb9nE5JvqnPE/outputs/image-0.png
Overview
Last week the team got together for a short offsite in Bristol to work through some of the behind-the-scenes project organisation.
Milestones Release (or our technical RADmap! ;P)
Month | Product | Research | Other |
---|---|---|---|
August | - UseLilypad (try.lilypad.tech) design & build | - Research simulator set up | - Supporting docs, socials & content |
September | - Dataprep branding & site | - Whitepaper ready for review | - Open Data Hack Support & Judging |
October | - Testnet v2 Release (IPC) Alpha program | - Implementing attacks in simulator (cheating, collusion, timeouts) | - Supporting docs, socials & content |
November | - LLM LoRA module | - Further investigations (collateralisation, checking %, jackpots) | - Labweek (Istanbul) |
More... | - Video Module | - Further investigations (taxes, prediction markets) | |
Lilypad Engineering Update
Cleaning up
We launched the Testnet two weeks ago. Anyone who's ever done a release under time pressure will know there's often a lot to clean up afterwards! That means paying down some technical debt, which is our current theme.
Automated testing
We managed to get the testnet out without any automated testing or an automated release and deployment process. That is not sustainable as we build out the project for long-term success: automated tests tell you whether a code change has caused a regression, without manually testing every combination of modules every time.
We plan to make a lot of major changes to the codebase in the coming months, so we're initially targeting integration (end-to-end) tests, which exercise the behavior of the system through its external interface without relying on the internals. This makes major internal changes easier while still allowing correctness to be verified.
The initial automated tests are here: https://github.com/bacalhau-project/lilypad/blob/main/test/integration_test.go They are currently flaky because of issues using IPFS in the CI environment, which we expect to resolve by implementing our HTTP publisher. We are going with CircleCI because it's the least bad option in our experience and it offers GPU runners.
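As a rough illustration of the black-box style described above, the sketch below runs a command as an external process and inspects only its exit code and printed output. It is written in Python for brevity (the project's actual tests are the Go file linked above), and `run_cli`, `check_job` and the stand-in command are hypothetical names for illustration, not part of Lilypad's test harness.

```python
import subprocess
import sys


def run_cli(argv, timeout=60):
    """Run a command as an external process and return (exit_code, stdout)."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout.strip()


def check_job(argv, expected_substring):
    """Black-box check: only the exit code and the printed output are inspected."""
    code, out = run_cli(argv)
    return code == 0 and expected_substring in out


# Stand-in command so the sketch runs anywhere. A real end-to-end test would
# invoke the lilypad CLI here instead; its arguments are deliberately left out.
STAND_IN = [sys.executable, "-c", "print('result: ok')"]

if __name__ == "__main__":
    print(check_job(STAND_IN, "result: ok"))
```

Because the check depends only on what the process prints and returns, the internals behind the command can be rewritten freely without touching the test.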
HTTP publisher
By publishing via HTTP directly from the compute node as well as making results available via IPFS CID, we can make the tests more reliable, as well as enabling the Filecoin data prep use case, which will require HTTP streaming to make available large amounts of data as efficiently as possible.
Data prep
We now have Filecoin data prep working inside Lilypad!
This takes an IPFS input CID containing some data and produces an IPFS output CID containing the CAR file needed to onboard the data to Filecoin.
Next steps on dataprep are to add support for reading data from S3 and to expose the resulting data and metadata via HTTP, so that SPs can stream it in directly (with cryptographically secure checksums on both ends).
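To make the "checksums on both ends" idea concrete, here is a minimal sketch of hashing a stream chunk by chunk while it is being copied, so that sender and receiver can each compute a digest without holding the whole file in memory. SHA-256, the chunk size and the function names are assumptions for illustration; the report does not say which checksum or transport details will be used.

```python
import hashlib
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def stream_with_checksum(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink in chunks and return the SHA-256 hex digest of the data."""
    digest = hashlib.sha256()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        sink.write(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    payload = b"example car file bytes" * 1000

    # Sender side: hash the data while writing it to the "wire" (a buffer here).
    wire = io.BytesIO()
    sent_digest = stream_with_checksum(io.BytesIO(payload), wire)

    # Receiver side: hash the data again while reading it off the wire.
    wire.seek(0)
    received_digest = stream_with_checksum(wire, io.BytesIO())

    print(sent_digest == received_digest)
```

If the two digests differ, the receiver knows the data was corrupted or altered in transit and can reject it.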
IPC
Interplanetary Consensus (IPC) is our pathway to handling real funds with a fast block time and low gas fees, which also enables high-performance on-chain scheduling.
We continue to test and iterate with the IPC team on issues. The latest appears to be a lack of filter support in IPC, or possibly in FVM or the underlying Lotus implementation of Filecoin.
GPU scheduling
We have a WIP branch that schedules GPU jobs only to GPU nodes. Currently, GPU jobs can accidentally be scheduled to CPU-only nodes.
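The kind of check this calls for can be sketched as a simple capability filter. This is not the code on the WIP branch; the `Node` and `Job` types and their fields are hypothetical, and a real scheduler would likely consider more than a GPU count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    gpus: int  # number of GPUs the node advertises (hypothetical field)


@dataclass(frozen=True)
class Job:
    job_id: str
    gpus_required: int  # 0 for CPU-only jobs (hypothetical field)


def eligible_nodes(job, nodes):
    """Return the nodes that advertise at least as many GPUs as the job needs."""
    return [node for node in nodes if node.gpus >= job.gpus_required]


if __name__ == "__main__":
    nodes = [Node("cpu-1", gpus=0), Node("gpu-1", gpus=1)]
    print([n.node_id for n in eligible_nodes(Job("sdxl", gpus_required=1), nodes)])
```

With this filter a GPU job never lands on a CPU-only node, while CPU-only jobs remain eligible for every node.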
Verification
We have a basic version of verification working in the CLI, but need to test it further and maybe record a nice demo video of it working.
Lilypad Research Update
Research is now splitting into two main branches: integrating the existing research into the codebase, and incorporating ideas from the continuing literature review into the existing research.
On the engineering integration front, we primarily focused on the structure of the two-sided marketplace. This includes: the structure of resource offers (coming from compute nodes) and job offers (coming from clients); the structure of the smart contract (and making it amenable to future upgrades); the way messages are transported between nodes in the network, in particular to enable negotiation between autonomous agents; the structure of the mediation protocol; timeout collateral and slashing (which is separate from anti-cheating collateral and slashing); the pricing of different modules (like SDXL and dataprep) and of arbitrary WASM modules; and a roadmap starting from a Modicum-like codebase (which was used to get us to testnet) and ending in a fully decentralized marketplace.
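One way to picture the two sides of the marketplace is as a pair of records plus a matching rule. The sketch below is purely illustrative: the field names, the integer prices and the "cheapest compatible offer wins" rule are all assumptions, and it ignores negotiation, mediation and collateral entirely.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ResourceOffer:
    node_id: str       # compute node making the offer (hypothetical field)
    module: str        # module the node is willing to run, e.g. "sdxl"
    asking_price: int  # price per run, in arbitrary units


@dataclass(frozen=True)
class JobOffer:
    client_id: str     # client requesting the job (hypothetical field)
    module: str        # module the client wants to run
    max_price: int     # most the client is willing to pay, same units


def match(job: JobOffer, offers: Sequence[ResourceOffer]) -> Optional[ResourceOffer]:
    """Pick the cheapest resource offer for the job's module within its budget."""
    candidates = [
        offer for offer in offers
        if offer.module == job.module and offer.asking_price <= job.max_price
    ]
    # min() returns the first of several equally cheap offers, or None if there are none.
    return min(candidates, key=lambda offer: offer.asking_price, default=None)


if __name__ == "__main__":
    offers = [
        ResourceOffer("node-a", "sdxl", asking_price=10),
        ResourceOffer("node-b", "sdxl", asking_price=7),
        ResourceOffer("node-c", "dataprep", asking_price=3),
    ]
    print(match(JobOffer("client-1", "sdxl", max_price=8), offers))
```

A negotiation protocol between autonomous agents would replace the fixed `match` rule with rounds of offers and counter-offers, but the records exchanged could keep a similar shape.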
On the literature review front, we are ramping down research into anti-cheating mechanisms, and branching into the structure of negotiation between autonomous agents, distributed scheduling, and container orchestration. These three topics are very much interconnected with each other, as well as with anti-cheating mechanisms, which makes this task very engaging. For example, if a client wants to schedule a job with more than one compute node, what is the best way of doing this? Is it best for the client to negotiate with each compute node separately? Or can there be some central node that coordinates the actions of multiple compute nodes? If the latter, do some nodes have to be on standby (depends on the use case), and if so, how does that impact the cost? On top of all of that, how do we make sure that the nodes are being honest, especially if they are effectively forming compute clusters and are impacted by each other's performance? This is just scratching the surface of the problems of this exciting and dense topic.
Levi also appeared at the Crypto x AI Mini conf. Keep an eye out for the recording at the Valory YouTube: https://www.youtube.com/@valoryag/videos
Lilypad "All the Things" Update
Educational!
Based on the Filecoin Unleashed presentation from Paris, we've released a short thread explaining Lilypad v1!
Check it out on our Twitter! (no it's not called X. That means close. Elon is cuckoo)
Melbourne Filecoin x Chainlink Meetup
If you're around Melbourne, Alison Haire is also hosting a live meetup with Chainlink at RMIT next week.
Thursday, 17 August 2023 (5:00 pm to 8:00 pm AEST)
Register here: https://www.meetup.com/en-AU/web3-melbourne/events/294657106/
CODwg Live Jam Session 2
The second Compute Over Data WG live jam session featured HackFS winners Pablo from DefiKicks and Cem from DYieldCollector. If you missed it, you can catch up on the video on YouTube here.
P.S. If you have a project you want to share - get in touch!
Project Showcase
We've added a project showcase to the docs to give you some ideas and inspiration for how to use Lilypad for your next project and celebrate the early builders working with us!
Check out all the projects here!
Open Data Hack - Apply now
Speaking of hackathons, we're supporting the Encode x Filecoin Open Data Hack!
30 August - 20 September
https://www.encode.club/open-data-hack
$6000 Bounty (Lilypad Integrations!)
What's Next?
We're working on publishing both an LLM inference module & a Filecoin Data Prep Module in the upcoming fortnight, as well as hoping to release Larana - the IPC Lilypad Testnet.
We're also busy putting together supporting docs and content for users to try out Lilypad, including a try.lilypad.tech website and some fully integrated examples for use.
The Research Simulator is also currently being built to test a variety of possible outcomes for the network game theory with some areas of
Contact Us
Chat to us on Slack: bit.ly/bacalhau-project-slack #bacalhau-lilypad