Hey everyone! I'm curious which AI tools can currently improve Jupyter Notebook workflows, especially for exploratory data analysis (EDA). Given how much AI is changing software development, I wonder whether similar tools are emerging for data science. From some preliminary research, GitHub Copilot's agent mode and Cursor look like the two leading options right now. My earlier experience with Jupyter in VSCode was poor because of cell-editing issues, but that seems to have improved since then: it can now create and execute cells properly.
For an AI agent to be really useful, it needs to execute notebook cells one at a time and freely inspect the variables held in the kernel's memory. I'm looking for recommendations on popular tools that meet these requirements and help data scientists be more productive with AI. Thanks a lot!
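To make the two requirements concrete, here is a minimal sketch using only the Python standard library. `MiniKernel`, `run_cell`, and `variables` are hypothetical names made up for illustration, not the API of Jupyter, Copilot, or Cursor. A real tool would talk to an actual notebook kernel, and this toy only imitates the behaviour: cells run one at a time against shared state, and the variables left in memory can be listed afterwards.

```python
import contextlib
import io


class MiniKernel:
    """Toy stand-in for a notebook kernel: one shared namespace, cells run one at a time."""

    def __init__(self):
        self.namespace = {}

    def run_cell(self, source):
        """Execute one cell in the shared namespace and return whatever it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(source, self.namespace)
        return buffer.getvalue()

    def variables(self):
        """Map each user-defined variable name to its type name."""
        return {
            name: type(value).__name__
            for name, value in self.namespace.items()
            if not name.startswith("__")  # skip entries such as __builtins__
        }


kernel = MiniKernel()
kernel.run_cell("prices = [3.5, 4.0, 12.25]")
# The second cell sees `prices` because both cells share one namespace.
output = kernel.run_cell("total = sum(prices)\nprint(total)")
```

After the second cell, `output` holds the printed text and `kernel.variables()` lists `prices` and `total`, which is the kind of state an agent would need to read before deciding what to run next.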
3 Answers
You might want to check out Marimo instead of sticking with Jupyter. Marimo has more integration options and stores notebooks as plain Python files, which could make them easier for LLMs to work with.
Honestly, the file format shouldn’t matter much for AI since it mainly interfaces with the notebook software to modify cells rather than editing file text directly. Marimo's AI integrations are likely beneficial, but the format isn’t what makes it effective.
In my experience, Marimo can be a real headache, especially because its API has kept changing over the last year and a half, and I've found that LLMs struggle to work with it effectively. Instead, I use Jupyter cell mode in VSCode, which integrates smoothly with both inline tools and CLI LLM applications.
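For anyone who hasn't used cell mode: the notebook is an ordinary .py file in which lines starting with `# %%` mark where each cell begins. Below is a rough sketch of how a text-based tool could split such a file into cells. `split_cells` is a hypothetical helper written for this post, not part of VSCode or Jupyter, and it ignores cell-type tags such as markdown cells.

```python
def split_cells(source):
    """Split percent-format source text into a list of cell bodies."""
    cells = []
    current = []
    for line in source.splitlines():
        if line.startswith("# %%"):
            # A marker closes the previous cell, if it had any content.
            if any(part.strip() for part in current):
                cells.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    # Keep the last cell, which has no marker after it.
    if any(part.strip() for part in current):
        cells.append("\n".join(current).strip())
    return cells


script = """# %%
import statistics

values = [1, 2, 3, 4]

# %%
mean = statistics.mean(values)
print(mean)
"""

cells = split_cells(script)
```

Because the file is plain text, a CLI LLM tool can read or rewrite a single cell without touching notebook JSON.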
I appreciate the insight! Sounds like sticking to Jupyter might be the safer route for now.
I’ve found GitHub Copilot to be pretty effective, but I wouldn’t exactly call it cutting-edge AI. It’s useful for generating code snippets, but it doesn’t fully understand the context or memory like an ideal agent would.
Does it actually execute commands to access memory? Like, can it do 'print(variable_name)' and utilize the output for decision-making?
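To spell out the behaviour being asked about, here is a minimal sketch of a probe-then-decide step using only the standard library. It is not how Copilot works internally. `run_and_capture` and `choose_next_cell` are hypothetical names, and the hard-coded rule stands in for the decision an LLM would make from the printed output.

```python
import contextlib
import io


def run_and_capture(source, namespace):
    """Run code against the shared namespace and return its printed output."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue().strip()


def choose_next_cell(namespace):
    """Hypothetical policy: probe a variable, then pick the next cell from what was printed."""
    observed = run_and_capture("print(len(rows))", namespace)
    if int(observed) == 0:
        return "print('no rows loaded, check the data source')"
    return "print(sum(rows) / len(rows))"


state = {"rows": [2, 4, 9]}
next_cell = choose_next_cell(state)
result = run_and_capture(next_cell, state)
```

The point is that the probe's output feeds back into the choice of the next cell, rather than the code being generated blind.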

Is Marimo widely adopted though? I don’t want to latch onto something niche. I had similar thoughts about Jupytext, which syncs notebooks and Python files, but I've never tried it.