33/where I dust off the weekly

Created: Nov 20, 2023 by Pradeep Gowda Updated: Dec 12, 2023 Tagged: weekly

Monday, 2023-11-13 to Sunday, 2023-11-19

While reading arne.me’s website, and specifically the weekly section, I re-realized that the weekly (newsletter) format works really well for capturing interesting things and noting why and how I found them interesting.

Journal

Visited Krishimela 2023 (see gallery) on Saturday Nov 18. It took 2 hours 40 minutes to get back home from the venue. Weekend traffic in Bengaluru is terrible. Best to keep travel to weekdays, when feasible, and away from busy hours (as few as they are).

Visited māḷigēgauḍanadoḍḍī in rāmanagara taluk to attend the gr̥hapravēṣa of a family member on Sunday Nov 19.

India lost the Men’s Cricket World Cup 2023 final to Australia on Nov 19 after a stellar league and semi-final performance. I lost interest after the early batting collapse, as soon as the score looked set to fall short of 300.

Thoughts

Clarity, of thought and action, is something one can develop. Just the prompt “Do you have clarity?” has a magical effect of making you present to your current state of mind and action.

Someone was recently complimented as:

…rare combination of humility, work ethic, creativity, and conviction in his ideas. Yes, this is a great combination that can be developed.
…a true uplifter, and an immense value creator…

Was reminded of the book Ego Is the Enemy and picked it up to read again.

GPT and Generative AI

Text to Speech

Whisper

Whisper Large-v3: New champion for the Open ASR leaderboard

The best performance for Whisper checkpoints is obtained when the language/task is set explicitly, i.e. it’s inbuilt. Language identification can be hit or miss, especially on accented audio.
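A minimal sketch of what "setting the language/task explicitly" can look like. The helper name `whisper_generate_kwargs` and the file name `audio.mp3` are my own assumptions for illustration; the commented usage assumes the Hugging Face transformers ASR pipeline, which accepts these values through `generate_kwargs`.

```python
def whisper_generate_kwargs(language: str = "english", task: str = "transcribe") -> dict:
    """Build generation kwargs that pin Whisper's language and task.

    Hypothetical helper: it only assembles the dictionary, so that
    language identification is not left to the model.
    """
    if task not in {"transcribe", "translate"}:
        raise ValueError(f"unsupported task: {task!r}")
    return {"language": language, "task": task}


# Assumed usage with the transformers pipeline (not run here, needs the model):
#   from transformers import pipeline
#   asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")
#   result = asr("audio.mp3", generate_kwargs=whisper_generate_kwargs("english"))
#   print(result["text"])
```

Pinning both values skips the language-identification step that the quote above calls hit or miss.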

Insanely fast whisper

2.5h of audio transcribed in under 2 minutes.

Cool workflow idea about how to use this by Kam

I do this with the Whisper API using Pipedream to summarize YouTube videos into my workspace. A summary plus actionable takeaways, all with a button click. It’s magic ✨. Changed my approach to longform audio, as I can decide if it’s worth the listen/skim of the transcript before the fact. I can then turn takeaways into todos and link them to similar notes.

Local LLMs

Zephyr 7B

ChatGPT Plus paused signups, so we made it easy to chat with open source models. Zephyr 7B Beta outperforms GPT 3.5 + Llama 2 70B and all 7B models – Chatbot (jupyter notebook).

Why you may want to take a look at a Mistral 7B finetune deployed with vLLM

There’s no point in using GPT-3.5 in production: the API has frequent downtime, plus high response latencies lead to subpar end user experiences. Instead, I’ve found a Mistral 7B finetune deployed with vLLM to be 10x faster(!), way cheaper, and stupidly simple to self-host. For the types of product features that dramatically benefit from a better model, GPT-4 is still undefeated. But for my use cases a much tinier (and faster) model results in a better UX. 10x faster as in we get a response from our vLLM deployment in ~200ms compared to sometimes ~2 sec from OpenAI, using fp16 on an A100.

We find it (Azure) to be significantly faster than OpenAI and so far perfectly reliable. I hear that it’s quite dependent on region though, as well as time of day (although I haven’t run those benchmarks myself).

Hardware: fp16 on an A100

The zephyr one and some uncensored ones (like dolphin2.2.1-mistral7b) work extremely well.

LMStudio.ai

LM Studio (Discover, download, and run local LLMs), recommended by BarneyFlames. The other option is ollama, which I have installed. I used ollama to download and run mistral locally on my M2 MBP.

Through installing LMStudio and playing around, I found out about GGUF.
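For scripting against a locally running ollama, here is a minimal sketch using only the standard library. It assumes ollama is serving on its default local port (11434) and that the `mistral` model has already been pulled; the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "mistral",
                           url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "mistral") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]


# Assumed usage (requires a running ollama server):
#   print(ask("Why is the sky blue?"))
```

Building the request separately from sending it makes it possible to inspect the payload without a server running.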

GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
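A quick way to tell whether a downloaded model file is GGUF is to look at its first bytes. This sketch assumes the GGUF header layout of a 4-byte `GGUF` magic followed by a little-endian 32-bit version number; the function name is hypothetical and the rest of the header is not parsed.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file


def read_gguf_version(path: str):
    """Return the GGUF version number, or None if the file is not GGUF."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The version is stored as an unsigned 32-bit little-endian integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version


# Assumed usage (the path is a placeholder):
#   print(read_gguf_version("model.gguf"))
```

A `None` result suggests the file is in an older format such as GGML, or is not a model file at all.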

The above page also mentions other tools: Faraday.dev; LostRuins/koboldcpp, a simple one-file way to run various GGML and GGUF models with KoboldAI’s UI; lollms-webui; and oobabooga/text-generation-webui, a Gradio web UI for Large Language Models that supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), and Llama models.

I asked ChatGPT and a locally running Mistral 7B a question: “how does one narrow down choices on what to work on for the rest of their life?”. See the comparison.

Developer Tooling

Came across Jetpack.io while reading this article about devbox. They have three OSS products:

Devbox: Build reliable and consistent dev environments without Docker. Uses Nix to create isolated, reproducible development environments for any workflow.
Launchpad: Deploy any project to Kubernetes, hassle free
Nixhub: Search over 400,000 package versions in one place

Faraday

Chat with AI Characters Offline. Runs locally. Zero-configuration. Own your data - The AI models that power Faraday are stored 100% locally on your computer. No one can change the behavior of your Characters, revoke access, or remove your data. Private – Chat data is saved to your computer and is never sent to a remote server. You can opt-in to share specific chats to help us improve models, but it is never required.

General Interest

Why I spent 3 years working on a coat hanger - Simone Giertz – a great little intro into the “hacker spirit”, the enthusiasm and perseverance it takes to build.

How mathematics built the modern world from Works in Progress online magazine.

Programming

DuckDB

If you aren’t already following Simon Willison on twitter, you should. He consistently produces great programming material – python for the longest time, creator of the datasette project, and recently duckdb and GPT.

He is very productive and publishes many TILs (Today I learnt). The latest, “Summing columns in remote Parquet files using DuckDB”, shows how to use duckdb to query remotely hosted parquet files, fetching only the subset of data required to answer the query.

The lambda example is particularly novel:

SELECT
    SUM(size) AS size
FROM read_parquet(
    list_transform(
        generate_series(0, 55),
        n -> 'https://huggingface.co/datasets/vivym/midjourney-messages/resolve/main/data/' ||
            format('{:06d}', n) || '.parquet'
    )
);
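To see what the lambda expands to, here is a rough Python equivalent of the `list_transform(generate_series(0, 55), ...)` part. It only builds the list of URLs and does not query anything; `BASE_URL` and `parquet_urls` are names I made up for the sketch.

```python
BASE_URL = "https://huggingface.co/datasets/vivym/midjourney-messages/resolve/main/data/"


def parquet_urls(first: int = 0, last: int = 55) -> list:
    """Mirror the SQL lambda: one zero-padded file name per series value.

    generate_series is inclusive on both ends, hence last + 1.
    """
    return [f"{BASE_URL}{n:06d}.parquet" for n in range(first, last + 1)]


urls = parquet_urls()
print(len(urls))
print(urls[0])
```

DuckDB's `read_parquet` accepts such a list of files, which is why the query can sum one column across all of them in a single statement.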

Python

I heard from a coworker that pip-tools is encouraged over poetry. This blog post, Boring Python: dependency management, goes into the use of pip-tools. I still don’t understand why one wouldn’t use poetry.

Technology

resend – Email for developers. The best API to reach humans instead of spam folders. Build, test, and deliver transactional emails at scale. via Harish Garg

pipedream is workflow automation for developers (see Kam’s Whisper workflow above for example usage).

Farming

This week I’m researching Cassia Tora as a cover crop on the recommendation of Hari.

Cover Crops: An idea worth planting?

See my album from my visit to Krishimela 2023. It has pictures of the plants with signs giving their names in English and Kannada, scientific names, uses, etc.

From around the web

A few days ago I was thinking about the Atari Email Archive (a collection of messages sent at Atari from 1983 to 1992), but I couldn’t remember which company the emails belonged to (Atari). Found it via Vivek Oberoi while reading his Roaring bitmaps post (see above).