🍃 Lilypad Project Report: August 7

Team Offsite, Project Roadmap Detail, Filecoin Data Prep Module

Image ❯ lilypad run sdxl:v0.9-lilypad1 "technical project roadmap, photorealistic, fantasy, sci-fi, star trek"

Result: https://ipfs.io/ipfs/QmbvE7r6XmQRvBHmATCTQTqBZTnSAkEdNvTb9nE5JvqnPE/outputs/image-0.png

🌐 Overview

Last week we got together for a short team offsite in Bristol and worked through some of the behind-the-scenes project organisation we needed.

Milestones Release (or our technical RADmap! ;P)

August

Product:
- UseLilypad (try.lilypad.tech) design & build
- LLM module
- Filecoin data prep module
- Launch AI inference & fine-tuning SaaS

Research:
- Research simulator set up
- V1 docs research findings
- First simulation done

Other:
- Supporting docs, socials & content
- Research (AA & Social Wallets, SP hardware)
- Lilypad Learn Videos

September

Product:
- Dataprep branding & site
- LoRA SDXL module

Research:
- Whitepaper ready for review
- First results of simulation published

Other:
- Open Data Hack Support & Judging
- Waterlily re-write
- Supporting docs, socials & content
- Lilypad Learn Videos
- CODwg
- Alpha Testnet Program Planning & registration open

October

Product:
- Testnet v2 Release (IPC) Alpha program

Research:
- Implementing attacks in simulator (cheating, collusion, timeouts)
- V2 docs (based on learnings)

Other:
- Supporting docs, socials & content
- Lilypad Learn Videos
- CODwg

November

Product:
- LLM LoRA module
- Incentivised testnet & program launch

Research:
- Further investigations (collateralisation, checking %, jackpots)

Other:
- Labweek (Istanbul)
- Presentations
- CODwg

More...

Product:
- Video Module
- 2D -> 3D Module
- Mainnet on IPC

Research:
- Further investigations (taxes, prediction markets)
- Further investigations (staking behind other nodes, reputation)
- Switching from trusted mediators to a consortium

⚒️ Lilypad Engineering Update

🧽 Cleaning up

We launched the Testnet two weeks ago. Anyone who’s ever done a release under time pressure will know there’s often a lot to clean up afterwards! This is also known as paying down technical debt, and it is our current theme…

🧪 Automated testing

We managed to get the testnet out without any automated testing or an automated release and deployment process. That is not sustainable as we build the project out for long-term success. You need automated testing so that you know whether a code change has caused a regression, without manually testing every combination of modules every time!

We plan to make a lot of major changes to the codebase in the coming months, so we’re initially targeting integration or end-to-end tests, which test the behaviour of the system via its external interface without relying on the internals. This keeps major changes to the internals easy, while still allowing correctness to be verified.
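As a rough illustration of what testing "via the external interface" can look like, here is a minimal sketch of a black-box check that runs a command the way a user would and only inspects what it prints. The helper names `runCLI` and `checkOutput` are hypothetical and are not taken from the Lilypad repository, and `go version` is only a stand-in command so the sketch runs anywhere Go is installed. In a real suite this logic would sit in a `_test.go` file and invoke the `lilypad` binary instead.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// runCLI executes a command exactly as a user would on the command line
// and returns everything it wrote to stdout and stderr.
// (Hypothetical helper, not part of the Lilypad codebase.)
func runCLI(name string, args ...string) (string, error) {
	var out bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stdout = &out
	cmd.Stderr = &out
	err := cmd.Run()
	return out.String(), err
}

// checkOutput is the black-box assertion: it looks only at what the
// command printed, never at the internals of the program under test.
func checkOutput(output, want string) error {
	if !strings.Contains(output, want) {
		return fmt.Errorf("output %q does not contain %q", output, want)
	}
	return nil
}

func main() {
	// Stand-in command; a real end-to-end test would run the lilypad CLI here.
	out, err := runCLI("go", "version")
	if err != nil {
		fmt.Println("command failed:", err)
		return
	}
	if err := checkOutput(out, "go version"); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("PASS")
}
```

Because the check depends only on the command line and its output, the internals can be rewritten freely without touching the test.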

The initial automated tests are here: https://github.com/bacalhau-project/lilypad/blob/main/test/integration_test.go

They are currently flaky because of issues using IPFS in the CI environment, which we expect to resolve by implementing our HTTP publisher. We are going with CircleCI because it’s the least bad option in our experience, and it offers GPU runners.

🌐 HTTP publisher

By publishing results via HTTP directly from the compute node, as well as making them available via IPFS CID, we can make the tests more reliable. This also enables the Filecoin data prep use case, which will require HTTP streaming to make large amounts of data available as efficiently as possible.

💽 Data prep

We now have Filecoin data prep working inside Lilypad!

This takes an IPFS input CID containing some data, and produces an IPFS output CID containing the CAR file necessary to onboard the data to Filecoin.

Next steps on dataprep are to add support for reading data from S3, and to expose the resulting data and metadata via HTTP so that SPs can stream it in directly (with cryptographically secure checksums on both ends).

💊 IPC

Interplanetary Consensus (IPC) is our pathway to handling real funds with a fast block time and low gas fees, which also enables high-performance on-chain scheduling.

We continue to test and iterate with the IPC team on issues. The latest appears to be a lack of filter support in IPC, or possibly in FVM or the underlying Lotus implementation of Filecoin.

🧠 GPU scheduling

We have a WIP branch for only scheduling GPU jobs to GPU nodes. Currently, GPU jobs can accidentally get scheduled to CPU-only nodes.
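The rule itself is simple to state in code. The following is a minimal sketch with hypothetical `Node` and `Job` types (the real types on the WIP branch may look quite different): a node is eligible for a job only if it has at least as many GPUs as the job asks for.

```go
package main

import "fmt"

// Node and Job are illustrative types, not the ones used in Lilypad.
type Node struct {
	ID   string
	GPUs int // number of GPUs the node offers; 0 means CPU-only
}

type Job struct {
	ID   string
	GPUs int // number of GPUs the job requires; 0 means CPU-only
}

// eligibleNodes keeps only the nodes that can satisfy the job's GPU
// requirement, so a GPU job can never land on a CPU-only node.
func eligibleNodes(job Job, nodes []Node) []Node {
	var out []Node
	for _, n := range nodes {
		if n.GPUs >= job.GPUs {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []Node{
		{ID: "cpu-1", GPUs: 0},
		{ID: "gpu-1", GPUs: 1},
	}
	gpuJob := Job{ID: "sdxl", GPUs: 1}

	for _, n := range eligibleNodes(gpuJob, nodes) {
		fmt.Println("can schedule", gpuJob.ID, "on", n.ID)
	}
}
```

CPU-only jobs (with `GPUs: 0`) remain eligible for every node under this rule; whether they should be kept off scarce GPU nodes is a separate policy question.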

✅ Verification

We have a basic version of verification working in the CLI, but we need to test it further and maybe record a nice demo video of it working.

🎓 Lilypad Research Update

Research is now splitting into two main branches: integrating the existing research into the codebase, and incorporating ideas from the continuing literature review into the existing research.

On the engineering integration front, we focused primarily on the structure of the two-sided marketplace. This includes:

- the structure of resource offers (coming from compute nodes) and job offers (coming from clients)
- the structure of the smart contract, and making it amenable to future upgrades
- the way in which messages are transported between nodes in the network, in particular to enable negotiation between autonomous agents
- the structure of the mediation protocol
- timeout collateral and slashing (which is separate from anti-cheating collateral and slashing)
- the pricing of different modules (like SDXL and dataprep) and of arbitrary WASM modules
- a roadmap starting from a Modicum-like codebase (which was used to get us to testnet) and ending in a fully decentralized marketplace
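To give a feel for the two sides of the marketplace, here is a small sketch of how resource offers, job offers, and per-module pricing might be modelled. Every type, field, and function name here is hypothetical, and the "cheapest offer within budget" rule is only one possible matching policy, not the protocol under research.

```go
package main

import "fmt"

// ResourceOffer is what a compute node advertises (illustrative only).
type ResourceOffer struct {
	NodeID string
	Prices map[string]uint64 // asking price per job, keyed by module name
}

// JobOffer is what a client submits (illustrative only).
type JobOffer struct {
	ClientID string
	Module   string // e.g. "sdxl" or "dataprep"
	MaxPrice uint64 // the most the client is willing to pay
}

// cheapestMatch returns the resource offer that supports the job's module
// at the lowest price not exceeding the client's maximum, if one exists.
func cheapestMatch(job JobOffer, offers []ResourceOffer) (ResourceOffer, bool) {
	var best ResourceOffer
	var bestPrice uint64
	found := false
	for _, o := range offers {
		price, ok := o.Prices[job.Module]
		if !ok || price > job.MaxPrice {
			continue // module not offered, or too expensive for this client
		}
		if !found || price < bestPrice {
			best, bestPrice, found = o, price, true
		}
	}
	return best, found
}

func main() {
	offers := []ResourceOffer{
		{NodeID: "node-a", Prices: map[string]uint64{"sdxl": 10, "dataprep": 3}},
		{NodeID: "node-b", Prices: map[string]uint64{"sdxl": 7}},
	}
	job := JobOffer{ClientID: "client-1", Module: "sdxl", MaxPrice: 8}

	if match, ok := cheapestMatch(job, offers); ok {
		fmt.Println("matched with", match.NodeID)
	} else {
		fmt.Println("no offer within budget")
	}
}
```

In a negotiated market the prices would be the outcome of message exchange between agents rather than fixed fields, which is one reason the message transport and negotiation structure matter.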

On the literature review front, we are ramping down research into anti-cheating mechanisms, and branching into the structure of negotiation between autonomous agents, distributed scheduling, and container orchestration. These three topics are very much interconnected with each other, as well as with anti-cheating mechanisms, which makes this task very engaging. For example, if a client wants to schedule a job with more than one compute node, what is the best way of doing this? Is it best for the client to negotiate with each compute node separately? Or can there be some central node that coordinates the actions of multiple compute nodes? If the latter, do some nodes have to be on standby (depends on the use case), and if so, how does that impact the cost? On top of all of that, how do we make sure that the nodes are being honest, especially if they are effectively forming compute clusters and are impacted by each other’s performance? This is just scratching the surface of the problems of this exciting and dense topic.

Levi also appeared at the Crypto x AI Mini conf. Keep an eye out for the recording at the Valory YouTube: https://www.youtube.com/@valoryag/videos

🌟 Lilypad "All the Things" Update

🎓 Educational!

Based on The Filecoin Unleashed Presentation from Paris, we've released a short thread explaining Lilypad v1!

Check it out on our Twitter! (no it's not called X. That means close. Elon is cuckoo)

If you're around Melbourne, Alison Haire is also hosting a live meetup with Chainlink at RMIT next week.

Decentralised Oracles and Data Management: Filecoin x Chainlink

📅 Thursday, 17 August 2023 (5:00 pm to 8:00 pm AEST)
🎫 Register here: https://www.meetup.com/en-AU/web3-melbourne/events/294657106/

📺 CODwg Live Jam Session 2

The second Compute Over Data Working Group (CODwg) live jam session featured HackFS winners Pablo from DefiKicks and Cem from DYieldCollector. If you missed it, you can catch up on the video on YouTube.

P.S. If you have a project you want to share - get in touch!

🎭 Project Showcase

We've added a project showcase to the docs to give you some ideas and inspiration for how to use Lilypad for your next project and celebrate the early builders working with us!

Check out all the projects here!

👩‍💻 Open Data Hack - Apply now

Speaking of hackathons, we're supporting the Encode x Filecoin Open Data Hack!

💡 Registrations are open now!

📅 30 August - 20 September

🌐 https://www.encode.club/open-data-hack

💰 $6000 Bounty (Lilypad Integrations!)

🔮 What's Next?

We're working on publishing both an LLM inference module & a Filecoin Data Prep Module in the upcoming fortnight, as well as hoping to release Larana - the IPC Lilypad Testnet.

We're also busy putting together supporting docs and content for users to try out Lilypad, including a try.lilypad.tech website and some fully integrated examples for use.

The Research Simulator is also currently being built to test a variety of possible outcomes for the network game theory with some areas of

☎️ Contact Us

💬 Chat to us on Slack: bit.ly/bacalhau-project-slack #bacalhau-lilypad
