Computer Related Bookmarks
Computer technology related.
Metalinks
- https://github.com/pfalcon/awesome-python-compilers - Retrospective of Python compilation efforts
- https://github.com/daturkel/learning-papers landmark papers in machine learning
- https://github.com/trickest/cve Almost every publicly available CVE PoC.
- https://github.com/danluu/post-mortems - A collection of postmortems.
- https://github.com/public-apis/public-apis - A collective list of free APIs
- https://github.com/sw-yx/spark-joy Easy ways to add design flair, user delight, and whimsy to your product!
Reference, Academic Stuff, etc.
- ☆☆ https://web.stanford.edu/class/cs168/index.html - Extremely useful computer science for the out-of-date practitioner (backed up locally)
- ☆ https://raphlinus.github.io/programming/rust/2018/08/17/undefined-behavior.html - Undefined behavior in C
- ☆ http://karpathy.github.io/2015/05/21/rnn-effectiveness/ - The Unreasonable Effectiveness of Recurrent Neural Networks
- ☆ https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
- ☆ https://github.com/AnthonyCalandra/modern-cpp-features
- ☆ https://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
- ☆ https://jepsen.io/analyses - Jepsen has analyzed over two dozen databases, coordination services, and queues—and we’ve found replica divergence, data loss, stale reads, read skew, lock conflicts, and much more
- ☆ https://www.quora.com/What-goes-into-making-an-OS-to-be-Unix-compliant-certified
- ☆ https://web.archive.org/web/20220303141223/https://pastebin.com/n6AGB62L An ex-Microsoft employee’s overview of Win32, WinForms, WPF, and UI toolkits in general
- ☆ https://github.com/kdeldycke/awesome-falsehood A curated list of falsehoods programmers believe in.
- ☆ https://www.youtube.com/watch?v=Kv1Hiv3ox8I How are Images Compressed? [46MB ↘↘ 4.07MB] - Jpeg compression
- ☆ https://betterdev.blog/minimal-safe-bash-script-template/
- ☆ https://www.matuzo.at/blog/html-boilerplate/
- ☆ https://soatok.blog/2022/12/29/what-we-do-in-the-etc-shadow-cryptography-with-passwords/
- ☆ https://www.cs251.com/ - Great Ideas in Theoretical Computer Science
- ☆ https://www.youtube.com/watch?v=mZck0N_T9Cs - And this year’s Turing Award goes to… Nisan–Wigderson PRNG
- http://pubs.opengroup.org/onlinepubs/000095399/ - POSIX Specification
- http://www.unicode.org/cgi-bin/UnihanRSIndex.pl?radical=159&minstrokes=4&maxstrokes=6&useutf8=true - Unihan Radical-Stroke Index for Radical #159
- http://research.microsoft.com/apps/pubs/default.aspx?id=144888 - Cycles, Cells and Platters: An Empirical Analysis of Hardware Failures on a Million Consumer PCs
- https://blog.cloudflare.com/keyless-ssl-the-nitty-gritty-technical-details/ - Keyless SSL: The Nitty Gritty Technical Details
- http://www.cs.cornell.edu/home/sam/FDpapers.html - Unreliable Failure Detectors for Reliable Distributed Systems. (No longer available. Can be found on archive.org)
- http://blog.foundationdb.com/databases-at-14.4mhz - Optimizing transaction time
- http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=422994DDDEE5D01D4B4340533A5034CF?doi=10.1.1.680.2097&rep=rep1&type=pdf - On the Euclidean Distance of Images
- http://www.math.uwaterloo.ca/~bico/papers/match_ijoc.pdf - Minimum weighted matching
- https://courses.engr.illinois.edu/cs598csc/sp2010/lectures/lecture10.pdf - Minimum weighted matching
- http://math.mit.edu/~goemans/18433S09/matching-notes.pdf Lecture notes on bipartite matching
- https://www.princeton.edu/~chiangm/optimization.pdf - Convex Optimization and Lagrange Duality
- https://arxiv.org/pdf/1801.00173.pdf - Theory of Deep Learning III: explaining the non-overfitting puzzle
- https://exploringjs.com/deep-js/toc.html - Deep Javascript (Book)
- https://talawah.io/blog/extreme-http-performance-tuning-one-point-two-million/ - Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
- https://danluu.com/new-cpu-features/ - What’s new in CPUs since the 80s?
- https://danluu.com/file-consistency/ - File consistency and crash/power outage resilience
- https://isocpp.org/faq - Super-FAQ : Standard C++
- http://yosefk.com/c++fqa/ - C++ Frequently Questioned Answers
- https://blog.ffwll.ch/2017/08/github-why-cant-host-the-kernel.html Why Github can’t host the Linux Kernel Community (A very informative critique of what’s missing in GitHub for massively distributed OSS projects)
- https://www.youtube.com/watch?v=ncHmEUmJZf4 CppCon 2017: Matt Kulukundis “Designing a Fast, Efficient, Cache-friendly Hash Table, Step by Step”
- https://sadh.life/post/builtins/ - Python “Builtins”
- https://swtch.com/~rsc/regexp/regexp4.html Regular Expression Matching with a Trigram Index or How Google Code Search Worked Russ Cox
- https://gto76.github.io/python-cheatsheet/ - Comprehensive Python Cheatsheet
- https://news.ycombinator.com/item?id=29458900 De-google HN discussion and links
- https://data.research.cornell.edu/content/readme - Guide to writing “readme” style metadata
- https://www.daemonology.net/blog/2018-01-17-some-thoughts-on-spectre-and-meltdown.html#disqus_thread - Best Summary of Spectre and Meltdown I’ve read so far.
- https://blog.frankmtaylor.com/2021/10/21/a-small-guide-for-naming-stuff-in-front-end-code/ - A Small Guide for Naming Stuff in Front-end Code – Frank M Taylor
- https://www.quora.com/Why-do-some-developers-at-strong-companies-like-Google-consider-Agile-development-to-be-nonsense - Why do some developers at strong companies like Google consider Agile development to be nonsense? - Quora
- https://github.com/Bakhtiyar-Garashov/flexbox-101 A guide contains everything you need to know about CSS flexbox
- https://gankra.github.io/blah/text-hates-you/ (All the edge cases of rendering text)
- https://lord.io/text-editing-hates-you-too/ Even more problems with editing text
- https://portswigger.net/research/http2 Security problems with HTTP/2
- https://github.com/dylanaraps/pure-bash-bible - 📖 A collection of pure bash alternatives to external processes.
- http://mywiki.wooledge.org/BashPitfalls - BashPitfalls - Greg’s Wiki
- https://sethmlarson.dev/blog/utf-8 UTF8 reference
- https://hackingcpp.com/cpp/cheat_sheets.html - C++ Cheat Sheets & Infographics (hacking C++)
- https://news.ycombinator.com/item?id=30596699 What Is IO Monad? (2018) [video] (youtube.com)
- https://news.ycombinator.com/item?id=31557809&utm_term=comment ffmpeg parameters and incantations
- https://mariadb.com/bsl11/ Business Source License 1.1
- https://www.youtube.com/watch?v=7aONIVSXiJ8 Introduction to memory management in Linux
- https://til.simonwillison.net/sqlite/one-line-csv-operations One-liner for running queries against CSV files with SQLite
- https://scottaaronson.blog/?p=208 Shor, I’ll do it: explaining Shor’s algorithm without using a single ket sign
- https://github.com/nepx/halfix Halfix is a portable x86 emulator written in C99. (May be a good reference for x86 cpu/devices)
- https://github.com/SuperDisk/tar.pl A tar creator and extractor in ~100 lines of Prolog
- https://en.wikipedia.org/wiki/HyperLogLog HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset.
- https://sigpipe.macromates.com/2020/macos-catalina-slow-by-design/ executable checksums sent for checking
- https://support.google.com/faqs/answer/7625886 - Retpoline
- https://en.wikipedia.org/wiki/Flajolet%E2%80%93Martin_algorithm - The Flajolet–Martin algorithm is an algorithm for approximating the number of distinct elements in a stream with a single pass and space-consumption logarithmic in the maximal number of possible distinct elements in the stream (the count-distinct problem).
- https://gist.github.com/kconner/cff08fe3e0bb857ea33b47d965b3e19f https://news.ycombinator.com/item?id=35847715 - MacOS Internals (Historical)
- https://preshing.com/20201210/flap-hero-code-review/ Flap Hero is a small game written entirely in C++ without using an existing game engine. All of its source code is available on GitHub. I think it can serve as an interesting resource for novice and intermediate game developers to study.
- https://github.com/arc80/FlapHero - A small C++ game built using Plywood
- ☆ https://www.quantamagazine.org/why-mathematicians-cant-find-the-hay-in-a-haystack-20180917/ - A really good summary of complexity theory
- https://en.algorithmica.org/hpc/ - A high performance computing book titled “Algorithms for Modern Hardware” by Sergey Slotin. Its intended audience is everyone from performance engineers and practical algorithm researchers to undergraduate computer science students who have just finished an advanced algorithms course and want to learn more practical ways to speed up a program.
- ☆ https://en.wikipedia.org/wiki/Post_correspondence_problem - A rather surprising undecidability result.
- https://computerhistory.org/blog/adobe-photoshop-source-code/
- https://developers.google.com/machine-learning/glossary/
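The Flajolet–Martin entry in the list above describes estimating the number of distinct elements in a stream with a single pass and logarithmic space. A minimal Python sketch of the basic single-hash variant is given here. The names `fm_estimate` and `_hash64` and the choice of BLAKE2b as the hash function are illustrative assumptions, not taken from the linked article. A single hash gives a high-variance estimate; practical versions combine many sketches, and HyperLogLog refines the same idea.

```python
import hashlib
from typing import Hashable, Iterable

# Correction factor from Flajolet and Martin's analysis of the estimator.
PHI = 0.77351


def _hash64(item: Hashable) -> int:
    """Map an item to a 64-bit integer (hypothetical helper; any good hash works)."""
    digest = hashlib.blake2b(repr(item).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def fm_estimate(stream: Iterable[Hashable]) -> float:
    """Estimate the number of distinct items in one pass using a single bitmap."""
    bitmap = 0
    for item in stream:
        h = _hash64(item)
        if h == 0:
            continue  # no set bit to record; vanishingly rare for a 64-bit hash
        # Index of the least significant 1 bit (number of trailing zeros).
        rho = (h & -h).bit_length() - 1
        bitmap |= 1 << rho
    if bitmap == 0:
        return 0.0  # empty stream
    # r is the index of the lowest bit that was never set.
    r = 0
    while bitmap & (1 << r):
        r += 1
    return (2 ** r) / PHI
```

Because only the bitmap is stored, repeated items do not change the result: `fm_estimate(items)` and `fm_estimate(items * 3)` return the same value.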
Open Source Tools
- ☆ https://github.com/enigo-rs/enigo?tab=readme-ov-file - Cross platform input simulation in Rust
- https://news.ycombinator.com/item?id=42158311
- ☆ http://sourcehut.org - something like GitHub, but created by OSS maniacs
- ☆ https://datasette.io/ An open source multi-tool for exploring and publishing data
- ☆ https://github.com/geohot/tinygrad - Small neural network tool
- ☆ https://github.com/yt-dlp/yt-dlp (youtube-download - fork)
- ☆ https://www.raylib.com/index.html - raylib is a simple and easy-to-use library to enjoy videogames programming.
- ☆ https://love2d.org/ - LÖVE is an awesome framework you can use to make 2D games in Lua
- ☆ https://github.com/susam/texme - TeXMe is a lightweight JavaScript utility to create self-rendering Markdown + LaTeX documents.
- ☆ https://news.ycombinator.com/item?id=26523212 ZPL: (Almost) C99 Powerkit (github.com/zpl-c) - Nice basic C library
- ☆ https://js13kgames.github.io/resources/
- ☆ https://boardgame.io/ Open Source Game Engine for Turn-Based Games
- ☆ https://jodd.org/ The Unbearable Lightness of Java - Jodd is a set of micro-frameworks and developer-friendly tools and utilities.
- ☆ https://github.com/eudoxia0/cmacro cmacro: Lisp macros for C
- ☆ https://kindavim.app/ - VIM accessibility bindings for macOS
- ☆ https://litestream.io/alternatives/cron/ / https://github.com/benbjohnson/litestream
- ☆ https://github.com/pretzelai/pretzelai - Pretzel is an open-source, offline browser-based tool for fast and intuitive data exploration and visualization. It can handle large data files, runs locally in your browser, and requires no backend setup.
- ☆ https://github.com/ViNeek/wuhoo - Wuhoo loosely stands for Windows Using Headers Only. It is an attempt to create a single-header library (in the spirit of STB) for graphics related window management, compatible with both C and C++.
- Probably great for rendering Mandelbrot!
- ☆ https://github.com/huggingface/candle - Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use
- https://github.com/EricLBuehler/mistral.rs
- ☆ https://en.wikipedia.org/wiki/OBS_Studio
- ☆ https://rqlite.io/ - rqlite is a distributed relational database that combines the simplicity of SQLite with the robustness of a fault-tolerant, highly available cluster
- ☆ https://github.com/xmake-io/xmake - Xmake = Build backend + Project Generator + Package Manager + [Remote Distributed] Build + Cache
- ☆ https://whattheduck.incentius.com/
- https://github.com/incentius-foss/WhatTheDuck
- ☆ https://docs.pygfx.org/stable/index.html Pygfx (pronounced “py-graphics”) is built on wgpu, enabling superior performance and reliability compared to OpenGL-based solutions.
- ☆ https://www.ironcalc.com/ - Open Source Online Spreadsheet
- https://www.brow.sh/ - Browsh is a fully-modern text-based browser
- http://xapian.org/docs/bindings/python/ - Xapian is an Open Source Search Engine Library, released under the GPL v2+. It’s written in C++, with bindings
- http://openrefine.org/ - Data analyzing and cleansing tool
- http://ankisrs.net/ - Make remembering things easy
- HazelCast - Distributed in memory data grid
- http://qira.me/ QIRA is a timeless debugger - All state is tracked while a program is running, so you can debug in the past. - Linux required, 64-bit Ubuntu recommended.
- https://github.com/codelucas/newspaper - newspaper scraping framework
- https://github.com/octol/vim-cpp-enhanced-highlight - Additional Vim syntax highlighting for C++ (including C++11/14/17)
- https://teuder.github.io/rcpp4everyone_en/index.html - Rcpp is a package that enables you to implement R functions in C++.
- https://github.com/abseil/abseil-cpp - Google’s C++ Library ported to Open Source
- https://github.com/asd5510/word2vec-chinese-demo - my Chinese word2vec visualization demo, using Chinese wiki as corpus
- http://stackoverflow.com/questions/1777060/what-linux-full-text-indexing-tool-has-a-good-c-api - What Linux Full Text Indexing Tool Has A Good C++ API? - Stack Overflow
- https://github.com/thoth-station/thoth - Python recommendation engine for python (ML) packages (so meta..)
- https://github.com/beurtschipper/Depix - Depix is a tool for recovering passwords from pixelized screenshots.
- https://www.luart.org/ - Comprehensive Windows framework to develop in Lua
- https://seb.jambor.dev/posts/improving-shell-workflows-with-fzf/ - Improving shell workflows with fzf - Sebastian Jambor’s blog
- https://ptsjs.org/ - A really impressive and simple Javascript visualization library
- http://zsync.moria.org.uk/ zsync is a file transfer program. It allows you to download a file from a remote server, where you have a copy of an older version of the file on your computer already. zsync downloads only the new parts of the file.
- https://github.com/google/fully-homomorphic-encryption - An FHE compiler for C++
- https://github.com/szhorvat/ConnectedGraphSampler Really Random Connected Graphs
- https://www.falstad.com/circuit/ electronic circuit simulator
- https://catala-lang.org/ Catala Lang: DSL designed for deriving implementations from legislative texts
- https://github.com/BurntSushi/ripgrep/issues/1497 - RFC: add ngram indexing support to ripgrep (Issue #1497)
- https://github.com/babysor/MockingBird Mocking Bird – Realtime Voice Clone for Chinese
- https://github.com/PollRobots/scheme An R7RS Scheme implemented in WebAssembly
- https://formulae.brew.sh/formula/tcpflow https://linux.die.net/man/1/tcpflow tcpflow is a program that captures data transmitted as part of TCP connections (flows), and stores the data in a way that is convenient for protocol analysis or debugging. A program like tcpdump(4) shows a summary of packets seen on the wire, but usually doesn’t store the data that’s actually being transmitted. In contrast, tcpflow reconstructs the actual data streams and stores each flow in a separate file for later analysis. tcpflow understands TCP sequence numbers and will correctly reconstruct data streams regardless of retransmissions or out-of-order delivery.
- https://typesense.org/about/ Typesense is an open source, typo tolerant search engine that is optimized for instant sub-50ms searches, while providing an intuitive developer experience.
- https://github.com/akkartik/teliva an environment for end-user programming “Enable all people to modify the software they use in the course of using it.”
- https://github.com/robmsmt/SpeechLoop toolkit to evaluate many different speech recognition engines.
- https://en.wikipedia.org/wiki/SipHash SipHash is an add–rotate–xor (ARX) based family of pseudorandom functions created by Jean-Philippe Aumasson and Daniel J. Bernstein in 2012, in response to a spate of “hash flooding” denial-of-service attacks (HashDoS) in late 2011.
- https://godotengine.org/ - Open Source Game Engine - Godot provides a huge set of common tools, so you can just focus on making your game without reinventing the wheel. (Engine for big games)
- https://sqlite-utils.datasette.io/en/stable/index.html https://github.com/simonw/sqlite-utils/
- https://htmx.org/ htmx gives you access to AJAX, CSS Transitions, WebSockets and Server Sent Events directly in HTML, using attributes, so you can build modern user interfaces with the simplicity and power of hypertext. htmx is small (~10k min.gz’d), dependency-free, extendable & IE11 compatible.
- https://github.com/confluentinc/ksql - ksqlDB is a database for building stream processing applications on top of Apache Kafka
- https://heaps.io/ Heaps.io is a mature cross platform graphics engine designed for high performance games. It is designed to leverage modern GPUs that are commonly available on both desktop and mobile devices. (Based on Haxe)
- https://github.com/facebookexperimental/eden EdenSCM is a cross-platform, highly scalable source control management system.
- https://github.com/codemix/deprank Deprank uses the PageRank algorithm to find the most important files in your JavaScript or TypeScript codebase.
- https://github.com/fivethirtyeight - small datasets that might be useful (mostly US-centric social/politics/sports stuff)
- https://github.com/ahilss/wxWidgets-wasm https://github.com/ahilss/portaudio-wasm https://github.com/ahilss/wavvy
- https://github.com/corkami/collisions - Hash collisions and exploitations
- https://tsoding.org/olive.c/ Olive.c is a simple graphics library that does not have any dependencies and renders everything into the given memory pixel by pixel.
- https://github.com/MichaelMure/git-bug Bug Tracker implemented with git
- https://github.com/charmbracelet/glow terminal based markdown reader
- https://github.com/Immediate-Mode-UI/Nuklear – A single-header ANSI C immediate mode cross-platform GUI library
- https://github.com/justjake/Gauss A Stable Diffusion app for macOS built with SwiftUI and Apple’s ml-stable-diffusion CoreML models.
- https://duckdb.org/ DuckDB is an in-process SQL OLAP database management system (SELECT * FROM ‘myfile.csv’)
- https://github.com/google/j2objc - A Java to iOS Objective-C translation tool and runtime.
- https://github.com/red-data-tools/YouPlot - YouPlot is a command line tool that draws plots on the terminal.
- https://github.com/oobabooga/text-generation-webui - web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA - goal is to become the AUTOMATIC1111/stable-diffusion-webui of text generation
- https://github.com/PyO3/pyo3 - Rust bindings for Python
- https://github.com/nidhaloff/deep-translator - integrate support for multiple famous translators in this tool
- https://github.com/vlcn-io/cr-sqlite cr-sqlite adds multi-master replication and partition tolerance to SQLite via conflict free replicated data types (CRDTs) and/or causally ordered event logs.
- https://github.com/chathub-dev/chathub/blob/main/README.md - ChatHub is an all-in-one chatbot client
- https://github.com/cotowali/cotowali A statically typed scripting language that transpiles into POSIX sh
- https://github.com/WebSheets - Javascript implementation of Excel functions
- https://github.com/byronka/minum - Minimum Java Web (HTTP) Framework
- https://github.com/sickcodes/Docker-OSX - Run macOS VM in a Docker! Run near native OSX-KVM in Docker! X11 Forwarding! CI/CD for OS X Security Research! Docker mac Containers.
- https://github.com/lmorg/murex - Support for additional type information in pipelines, which can be used for complex data formats like JSON or tables, meaning all of your existing UNIX tools work more intelligently and without any additional configuration.
- https://github.com/wireservice/csvkit - A suite of utilities for converting to and working with CSV, the king of tabular file formats.
- https://exple.tive.org/blarg/2023/02/17/modern-problems-require-modern-solutions/ - Modern Problems Require Modern Solutions
- https://www.visidata.org/ - Open-source data multitool
- https://lima-vm.io - launches Linux virtual machines with automatic file sharing and port forwarding (probably the go-to tool for macOS Linux VMs now?)
- https://github.com/uutils/coreutils - Rust rewrite of GNU coreutils
- https://github.com/straussmaximilian/ocrmac - small Python wrapper to extract text from images on a Mac system. Uses the vision framework from Apple
- https://github.com/ttscoff/curlyq - A command line helper for curl and web scraping
- https://github.com/jasonjmcghee/rem - locally record everything you view on your Apple Silicon computer
- https://github.com/electric-sql/pglite - PGlite - Postgres in WASM
- https://github.com/messense/jieba-rs - The Jieba Chinese Word Segmentation Implemented in Rust
- https://github.com/chearon/dropflow - Canvas text layout engine.
- https://github.com/mitchellh/libxev - Zig/C event loop
- https://github.com/paul-gauthier/aider - LLM code assistant tool
- https://github.com/benibela/xidel - Extracting data from HTML
- https://mise.jdx.dev/how-i-use-mise.html - Standardized CLI environments
- https://github.com/gelisam/hawk - Similar to awk, but using Haskell as the text-processing language.
- https://github.com/amber-lang/amber - Programming language that compiles to Bash. It’s a high level programming language that makes it easy to create shell scripts.
- https://github.com/skeeto/w64devkit - Some better mingw32 packaged for modern windows
- https://github.com/arunsupe/semantic-grep (Grep using word2vec or something…)
- https://ui.shadcn.com/ - Beautifully designed components that you can copy and paste into your apps.
- https://pyspread.gitlab.io - goal of pyspread is to be the most pythonic spreadsheet
- https://github.com/tidwall/bgen - Bgen is a B-tree generator for C. It’s small & fast and includes a variety of options for creating custom in-memory btree based collections.
- https://github.com/lukas-blecher/LaTeX-OCR
- https://github.com/AnswerDotAI/gpu.cpp - GPGPU on WebGPU spec as a portable low-level GPU interface
Json
- https://github.com/ynqa/jnv
- https://www.oilshell.org/
- https://www.pgrs.net/2024/03/21/duckdb-as-the-new-jq/
- https://dt.plumbing/
- https://www.nushell.sh/
- https://github.com/jhspetersson/fselect
- https://news.ycombinator.com/item?id=40864541
- https://news.ycombinator.com/item?id=41482661
Machine Learning
Theory, Fundamentals, Documentation, Tutorials, Guides
- ☆ https://agi.safe.ai/ - Humanity’s Last Exam is a challenging, multi-modal benchmark designed to rigorously test the limits of large language models across a broad range of subjects.
- ☆ https://evjang.com/2021/10/23/generalization.html
- https://towardsdatascience.com/choosing-the-right-gpu-for-deep-learning-on-aws-d69c157d8c86 - Choosing the right GPU for deep learning on AWS, by Shashank Prasanna (Towards Data Science)
- https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-nvidia-driver.html - Install NVIDIA drivers on Linux instances - Amazon Elastic Compute Cloud
- https://github.com/openai/gpt-2/blob/master/src/model.py
- http://www.nervanasys.com/demystifying-deep-reinforcement-learning/ - Guest Post (Part I): Demystifying Deep Reinforcement Learning - Nervana
- https://www.youtube.com/watch?v=aircAruvnKk Good intro to neural network
- http://kvfrans.com/coloring-and-shading-line-art-automatically-through-conditional-gans/ - Machine Learning how to fill in manga art
- http://karpathy.github.io/neuralnets/ - Hacker’s guide to Neural Networks
- http://colah.github.io/posts/2015-08-Understanding-LSTMs/ - Understanding LSTM Networks – colah’s blog
- https://en.wikipedia.org/wiki/Cosine_similarity In data analysis, cosine similarity is a measure of similarity between two sequences of numbers.
- https://www.youtube.com/watch?v=e9U0QAFbfLI Youtube explanation of Cosine Similarity by StatQuest
- https://beergameai.github.io/ - Playing the Beer Game Using Reinforcement Learning
- https://github.com/dair-ai/Prompt-Engineering-Guide - 🐙 Guides, papers, lecture, notebooks and resources for prompt engineering
- https://www.youtube.com/watch?v=YCzL96nL7j0 LSTM Explained
- https://www.youtube.com/watch?v=kCc8FmEb1nY Let’s build GPT: from scratch, in code, spelled out. Andrej Karpathy
- https://twitter.com/alexalbert__/status/1636488551817965568 the first jailbreak for ChatGPT-4
- HN discussion https://news.ycombinator.com/item?id=35190383
- https://www.youtube.com/watch?v=nctqc8FBJ2U - geohotz live programming - George Hotz Programming ChatLLaMA: get in losers we’re building a chatbot
- https://medium.com/@imicknl/how-to-create-a-private-chatgpt-with-your-own-data-15754e6378a1 - How to create a private ChatGPT with your own data, by Mick Vleeshouwer (Medium)
- https://simonwillison.net/2023/Mar/17/beat-chatgpt-in-a-browser/ - some useful info on the state of the art (note that the author did admit in https://news.ycombinator.com/item?id=35391717 that the comparison to GPT-3 was a bit of an exaggeration)
- https://react-lm.github.io/ - Some prompt techniques - In this paper, we explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external sources, such as knowledge bases or environments, to gather additional information
- https://til.simonwillison.net/llms/python-react-pattern - Some actual illustration of how to implement react prompting
- https://sharegpt.com/ Share your wildest ChatGPT conversations with one click.
- This thing is basically abandoned by now - but the conversations data might be available somewhere else
- https://jaykmody.com/blog/gpt-from-scratch/ - GPT in 60 Lines of NumPy
- https://github.com/Crataco/ai-guide - Guide for LLMs
- https://github.com/underlines/awesome-marketing-datascience/blob/master/README.md
- https://www.youtube.com/watch?v=ySEx_Bqxvvo - MIT 6.S191: Recurrent Neural Networks, Transformers, and Attention
- https://www.youtube.com/watch?v=kIiO4VSrivU - MIT 6.S191: Trustworthy Deep Learning
- https://www.youtube.com/watch?v=Fjh1kwOzr7c OpenAI’s GPT-4 Just Got Supercharged!
- https://arxiv.org/abs/2211.09066 Prompting - Teaching Algorithmic Reasoning via In-context Learning
- https://www.make-safe-ai.com/is-bing-chat-safe/ - make-safe-ai/is-bing-chat-safe
- https://arxiv.org/abs/2201.11903 https://arxiv.org/abs/2205.11916 - Chain of thought Prompting
- https://www.youtube.com/watch?v=g2BRIuln4uc - Intuition Behind Self-Attention Mechanism in Transformer Networks
- https://www.youtube.com/watch?v=S27pHKBEp30 - LSTM is dead. Long Live Transformers!
- https://arxiv.org/abs/2305.20010 Human or Not? A Gamified Approach to the Turing Test
- https://www.youtube.com/watch?v=VcVfceTsD0A - Max Tegmark: The Case for Halting AI Development (Lex Fridman Podcast #371)
- https://www.youtube.com/watch?v=X7c0T7uwtkM - MARI Grand Seminar - Large Language Models and Low Resource Languages
- https://a16z.com/2023/05/25/ai-canon/ - Some useful links on various AI topics
- https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post - A very dense essay on why RLHF might not work
- https://www.youtube.com/watch?v=wVzuvf9D9BU - GPT 4 is Smarter than You Think: Introducing SmartGPT
- Contains some interesting bits about prompting and how to best utilize GPT-4
- https://huggingface.co/blog/stackllama - StackLLaMA: A hands-on guide to train LLaMA with RLHF
- https://szopa.medium.com/teaching-chatgpt-to-speak-my-sons-invented-language-9d109c0a0f05 - Teaching ChatGPT to Speak my Son’s Invented Language, by Ryszard Szopa (Medium)
- https://github.com/huggingface/peft https://arxiv.org/abs/2106.09685 - State-of-the-art Parameter-Efficient Fine-Tuning (PEFT) methods
- https://www.technologyreview.com/2020/10/16/1010566/ai-machine-learning-with-tiny-data/ - A radical new technique lets AI learn with practically no data (MIT Technology Review)
- https://huggingface.co/blog/4bit-transformers-bitsandbytes - Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA
- https://crfm.stanford.edu/2023/03/13/alpaca.html - Stanford CRFM
- https://crfm.stanford.edu/2023/05/22/alpaca-farm.html - Stanford CRFM
- https://www.youtube.com/watch?v=_njf22xx8BQ - 12 New Code Interpreter Uses
- https://www.paepper.com/blog/posts/how-and-why-stable-diffusion-works-for-text-to-image-generation/ - How and why stable diffusion works for text to image generation (Päpper’s Machine Learning Blog)
- https://www.pinecone.io/learn/vector-database/ - What is a Vector Database & How Does it Work? Use Cases + Examples (Pinecone)
- https://www.semianalysis.com/p/google-we-have-no-moat-and-neither - Google “We Have No Moat, And Neither Does OpenAI”
- https://hazyresearch.stanford.edu/blog/2023-03-27-long-learning - From Deep to Long Learning? (context length)
- https://arxiv.org/abs/2212.07677 https://arxiv.org/abs/2208.01066 - Learning in context
- https://www.youtube.com/watch?v=AsNTP8Kwu80 - Recurrent Neural Networks (RNNs), Clearly Explained!!! StatQuest
- https://developer.apple.com/videos/play/wwdc2023/10042/ - Explore Natural Language multilingual models Learn how to create custom Natural Language models for text classification and word tagging using multilingual, transformer-based embeddings. We’ll show you how to train with less data and support up to 27 different languages across three scripts. Find out how to use these embeddings to fine-tune complex models trained in PyTorch and TensorFlow. For more on Natural Language, check out “Make apps smarter with Natural Language” from WWDC20.
- https://www.youtube.com/watch?v=L_Guz73e6fw - Sam Altman Lex Fridman Interview
- https://twitter.com/emollick - posts very good commentary and news on AI especially LLMs
- https://www.youtube.com/watch?v=tkqD9W5U9F4 - AlphaGo-style Tree Search on Thought Trees in GPT models
- https://github.com/lightvector/KataGo/blob/master/docs/GraphSearch.md - Monte-Carlo Tree Search (MCTS) except applied to directed graphs instead of trees (caching of duplicate nodes)
- http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf - Efficient BackProp
- http://www.incompleteideas.net/IncIdeas/BitterLesson.html - highlights that leveraging computation through general methods is the most effective approach in AI research. Relying on human knowledge and understanding of specific domains often hinders progress
- https://together.ai/blog/tri-dao-flash-attention - Introducing Together AI Chief Scientist Tri Dao, as he releases FlashAttention-2 to speed up model training and inference
- https://deci.ai/blog/decilm-15-times-faster-than-llama2-nas-generated-llm-with-variable-gqa/ - 15 times Faster than Llama 2: Introducing DeciLM - NAS-Generated LLM with Variable GQA
- https://microsoft.github.io/generative-ai-for-beginners/#/ - Generative AI for Beginners
- https://www.lesswrong.com/posts/cgqh99SHsCv3jJYDS/we-found-an-neuron-in-gpt-2 - We Found An Neuron in GPT-2
- https://esteininger.medium.com/building-a-vector-search-engine-using-hnsw-and-cosine-similarity-753fb5268839 - Building a Vector Search Engine Using HNSW and Cosine Similarity, by Ethan Steininger (Medium)
- https://thedataquarry.com/posts/vector-db-1/ - Vector databases (1): What makes each one different? (The Data Quarry)
- ☆ https://bernsteinbear.com/blog/compiling-ml-models/ Compiling ML models to C for fun
- https://paperswithcode.com/ -
- https://www.youtube.com/watch?v=9dSkvxS2EB0 - Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Paper Explained)
- https://cgad.ski/blog/when-gradient-descent-is-a-kernel-method.html
- https://arxiv.org/abs/2305.18290 - Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- https://hugodutka.com/posts/answering-legal-questions-with-llms/ - Ideas on how to do this
- https://www.reddit.com/r/LocalLLaMA/comments/191s7x3/a_simple_guide_to_local_llm_finetuning_on_a_mac/
- https://www.reddit.com/r/LocalLLaMA/comments/1d1bnql/awesome_prompting_techniques/
- https://huggingface.co/blog/rlhf
- https://arxiv.org/abs/2205.14135 - FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Libraries / Tools
- ☆ https://aihorde.net/
- ☆ https://annas-archive.org/datasets
- Chinese books: https://annas-blog.org/duxiu-exclusive.html
- https://bellard.org/libnc/ - LibNC is a C library for tensor manipulation. It supports automatic differentiation and can be used to implement machine learning models such as LSTM and Transformers. (Intel+Linux/Win only, not open source)
- https://github.com/google/sentencepiece SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabulary size is predetermined prior to the neural model training.
- https://nn-512.com/ Generate Neural Nets in C (compiler is in Go)
- https://github.com/Unstructured-IO Open-Source Pre-Processing Tools for Unstructured Data
- https://news.ycombinator.com/item?id=38487199 - This is a fascinating read about the optimizability of standard machine learning tooling
- https://www.reddit.com/r/LocalLLaMA/comments/1bv3hl4/anythingllm_an_opensource_allinone_ai_desktop_app/
- https://coral.ai/products/accelerator/ - A USB accessory that brings accelerated ML inferencing to existing systems.
- https://github.com/exo-explore/exo - Run your own AI cluster at home with everyday devices. Maintained by exo labs.
RAG
- https://github.com/explodinggradients/ragas
- https://news.ycombinator.com/item?id=39780114
- https://www.reddit.com/r/LocalLLaMA/comments/1cqolrb/fully_local_rag_with_llama3/
- https://huggingface.co/nvidia/Llama3-ChatQA-1.5-8B
Audio / Speech
- https://github.com/coqui-ai/TTS - text to speech (see https://news.ycombinator.com/item?id=26790951 )
- https://sites.research.google/usm/ - Universal Speech Model - Towards Automatic Speech Recognition for All
- https://huggingface.co/alvanlii/whisper-small-cantonese/tree/main - alvanlii/whisper-small-cantonese at main
- https://news.ycombinator.com/item?id=38487359 The Seamless Communication models - SeamlessExpressive: A model that aims to preserve expression and intricacies of speech across languages. SeamlessStreaming: A model that can deliver speech and text translations with around two seconds of latency. SeamlessM4T v2: A foundational multilingual and multitask model that allows people to communicate effortlessly through speech and text. Seamless: A model that merges capabilities from SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2 into one.
- https://arxiv.org/abs/2402.04825 https://stability-ai.github.io/stable-audio-demo/
- https://tincans.ai/slm3 - world first joint speech-language model
- https://github.com/collabora/WhisperSpeech - An Open Source text-to-speech system built by inverting Whisper. Previously known as spear-tts-pytorch.
- https://github.com/suno-ai/bark
- https://huggingface.co/suno/bark
- https://github.com/mustafaaljadery/lightning-whisper-mlx
Images / Videos
- https://www.chenyang.co/diffusion.html - Diffusion models from scratch, from a new theoretical perspective
- https://github.com/nerdyrodent/VQGAN-CLIP Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
- https://github.com/nerdyrodent/CLIP-Guided-Diffusion/ Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab.
- https://github.com/mehdidc/feed_forward_vqgan_clip Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt.
- https://github.com/openai/glide-text2im - GLIDE: a diffusion-based text-conditional image synthesis model
- https://github.com/alembics/disco-diffusion - GitHub - alembics/disco-diffusion
- https://www.theverge.com/2023/4/6/23672862/microsoft-image-creator-edge-sidebar-dall-e-ai-generator - Microsoft’s rolling out Edge’s AI image generator to everyone - The Verge
- https://llava-vl.github.io/ - LLaVA
- https://news.ycombinator.com/item?id=38487112 Easy Stable Diffusion XL in your device, offline
- https://stability.ai/news/stable-video-diffusion-open-ai-video-model - Introducing Stable Video Diffusion — Stability AI
- https://huggingface.co/liuhaotian/llava-v1.6-vicuna-13b
Text, LLM, etc.
- ☆ https://huggingface.co/numind/NuExtract - fine-tuned on a private high-quality synthetic dataset for information extraction. To use the model, provide an input text (less than 2000 tokens) and a JSON template describing the information you need to extract.
- ☆ https://www.reddit.com/r/LocalLLaMA/comments/1dtt32y/new_collection_of_llama_mistral_phi_qwen_and/
- ☆ https://arxiv.org/abs/2403.04652 - Yi: Open Foundation Models by 01.AI
- Hey everyone, the YI paper has been published and it’s a gem of information on how to train and finetune strong models. In the era of most models refusing to publish any meaningful information, these paper delves into lots of details of how the data was collated, filtered, the data mix etc and how the SFT data was processed. – https://www.reddit.com/r/LocalLLaMA/comments/1b9kq9v/01ai_paper_is_a_gem_for_model_trainers/
- ☆ https://docs.google.com/spreadsheets/u/0/d/1kT4or6b0Fedd-W_jMwYpb63e1ZR3aePczz3zlbJW-Y4/htmlview?pli=1 - A manually curated list of LLMs
- ☆ https://lightning.ai/pages/community/lora-insights/
- ☆ https://twitter.com/rasbt/status/1712816975083155496 - I ran hundreds if not thousands of LoRA & QLoRA experiments to finetune open-source LLMs, and here’s what I learned:
- ☆ https://github.com/OpenAccess-AI-Collective/axolotl
- ☆ https://unsloth.ai/blog/gemma-bugs
- https://github.com/facebookresearch/llama - Inference code for LLaMA models
- https://ai.facebook.com/blog/large-language-model-llama-meta-ai/ - Introducing LLaMA: A foundational, 65-billion-parameter language model
- https://github.com/ggerganov/llama.cpp - Port of Facebook’s LLaMA model in C/C++
- https://github.com/remixer-dec/llama-mps - Experimental fork of Facebooks LLaMa model which runs it with GPU acceleration on Apple Silicon M1/M2
- ☆ https://arxiv.org/abs/2404.15758 - Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models
- ☆ https://arxiv.org/abs/2403.06634 - Stealing Part of a Production Language Model “We also recover the exact hidden dimension size of the gpt-3.5-turbo model”
- ☆ https://github.com/Aider-AI/aider - Aider lets you pair program with LLMs, to edit code in your local git repository.
- https://open-assistant.io/ - Apparently developing a foundation model and training data
- https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B - togethercomputer/GPT-NeoXT-Chat-Base-20B · Hugging Face
- https://github.com/facebookresearch/metaseq https://huggingface.co/facebook - Some other lesser-known Facebook models from before LLaMA
- https://huggingface.co/bigscience/bloom BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. As such, it is able to output coherent text in 46 languages and 13 programming languages that is hardly distinguishable from text written by humans. BLOOM can also be instructed to perform text tasks it hasn’t been explicitly trained for, by casting them as text generation tasks.
- https://github.com/nomic-ai/gpt4all - gpt4all: open-source LLM chatbots that you can run anywhere
- https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/ - Cerebras-GPT: A Family of Open, Compute-efficient, Large Language Models - Cerebras
- https://arxiv.org/abs/2304.03208 Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
- https://huggingface.co/cerebras - cerebras (Cerebras)
- https://github.com/NouamaneTazi/bloomz.cpp - proof of concept iOS app with a small model
- https://arxiv.org/abs/2303.17564 - BloombergGPT
- https://vicuna.lmsys.org/ - Another claimed better model based on LLaMA - model not available yet but presumably might be if they figure out licensing?
- https://github.com/manyoso/haltt4llm - Hallucination Trivia Test for Large Language Models
- https://llamahub.ai/ - Connect custom data sources to your LLM with one or more of these loaders (via LlamaIndex or LangChain)
- https://github.com/huggingface/transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
- https://github.com/lm-sys/FastChat/#vicuna-weights - An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
- https://aws.amazon.com/blogs/aws/amazon-codewhisperer-free-for-individual-use-is-now-generally-available/ - Amazon CodeWhisperer, Free for Individual Use, is Now Generally Available (AWS News Blog)
- https://www.mosaicml.com/blog/mpt-7b - Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs
- https://huggingface.co/tiiuae/falcon-40b-instruct - Apache Licensed Model
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard - Open LLM Leaderboard - a Hugging Face Space by HuggingFaceH4
- https://github.com/mlc-ai/mlc-llm - Our mission is to enable everyone to develop, optimize and deploy AI models natively on everyone’s devices.
- https://www.reddit.com/r/LocalLLaMA/ - The r/LocalLLaMA subreddit
- https://arstechnica.com/information-technology/2023/03/anthropic-introduces-claude-a-more-steerable-ai-competitor-to-chatgpt/ - Anthropic introduces Claude, a “more steerable” AI competitor to ChatGPT (Ars Technica)
- https://github.com/imaurer/awesome-decentralized-llm - Collection of LLM resources that can be used to build products you can “own” or to perform reproducible research.
- https://bair.berkeley.edu/blog/2023/04/03/koala/ - Koala: A Dialogue Model for Academic Research
- https://github.com/microsoft/DeepSpeed - DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
- https://github.com/ayaka14732/TransCan - English-to-Cantonese translation model
- https://github.com/facebookresearch/StarSpace (2019) StarSpace is a general-purpose neural model for efficient learning of entity embeddings for solving a wide variety of problems
- https://www.reddit.com/r/bing/comments/11bd91j/release_of_the_whole_initial_prompt_of_bing_chat/ - Release of the whole initial prompt of Bing Chat (r/bing)
- https://old.reddit.com/r/ChatGPT/comments/12o29gl/gpt4_week_4_the_rise_of_agents_and_the_beginning/ - GPT-4 Week 4. The rise of Agents and the beginning of the Simulation era (r/ChatGPT)
- https://github.com/togethercomputer/RedPajama-Data RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens
- https://github.com/ymcui/Chinese-LLaMA-Alpaca - 中文LLaMA&Alpaca大语言模型+本地CPU/GPU训练部署 (Chinese LLaMA & Alpaca LLMs)
- https://github.com/FranxYao/chain-of-thought-hub - Measuring LLMs’ Reasoning Performance
- https://github.com/bigscience-workshop/petals - Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading. Generate text using distributed 176B-parameter BLOOM or BLOOMZ and fine-tune them for your own tasks
- https://github.com/facebookresearch/fairseq - Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers
- https://github.com/openlm-research/open_llama - Open Reproduction of LLaMA
- https://github.com/FMInference/FlexGen - FlexGen: High-throughput Generative Inference of Large Language Models with a Single GPU
- https://github.com/tloen/alpaca-lora - Fine Tuning LLAMA (Alpaca reproduction)
- https://arxiv.org/abs/2303.12712 - Sparks of Artificial General Intelligence: Early experiments with GPT-4
- https://old.reddit.com/r/MachineLearning/comments/11vfbo9/p_we_gave_gpt35_tools_that_developers_use_and_let/ - [P] We gave GPT-3.5 tools that developers use and let it use them in a sandboxed cloud environment (Demo) (r/MachineLearning)
- https://old.reddit.com/r/LocalLLaMA/wiki/models - r/LocalLLaMA wiki page on models
- https://godmode.space/ - Godmode is a web platform to access the powers of autoGPT and babyAGI.
- https://www.cognosys.ai/ - Another AutoGPT interface
- https://agentgpt.reworkd.ai/ - Another AutoGPT interface
- https://www.youtube.com/watch?v=3VDpguo4R-0 - “AI in Google Sheets is Here. Get Sample Data and MORE!” - Review of Google Sheets Duet
- https://twitter.com/karpathy/status/1697317416978755881 - Speculative execution for LLMs is an excellent inference-time optimization.
- https://arxiv.org/abs/2310.10631 - Llemma: An Open Language Model For Mathematics
- https://news.ycombinator.com/item?id=37879077 - ChatGPT’s system prompts
- https://huggingface.co/datasets/roneneldan/TinyStories - roneneldan/TinyStories · Datasets at Hugging Face
- https://github.com/OpenBMB/ChatDev - Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
- https://arxiv.org/abs/2310.06694 https://xiamengzhou.github.io/sheared-llama/ - pruning as a constrained optimization problem where we learn pruning masks to search for a subnetwork matching a pre-specified target architecture while maximizing performance
- https://blog.gopenai.com/how-to-speed-up-llms-and-use-100k-context-window-all-tricks-in-one-place-ffd40577b4c - The Secret Sauce behind 100K context window in LLMs: all tricks in one place, by Galina Alperovich (GoPenAI)
- https://arxiv.org/pdf/2307.06435.pdf - A Comprehensive Overview of Large Language Models (5 Oct 2023)
- https://arxiv.org/abs/2305.07759 TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
- https://github.com/artidoro/qlora - QLoRA: Efficient Finetuning of Quantized LLMs
- https://news.ycombinator.com/item?id=38338635 Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation)
- https://news.ycombinator.com/item?id=38364084 Exponentially faster language modelling
- https://github.com/karpathy/llama2.c - Reference C implementation of LLAMA
- https://github.com/wangyi-fudan/wyGPT - “This is my 2.5 years’ day-and-night efforts on GPT. It is mature and highly optimized on single GPU.”
- https://github.com/mit-han-lab/TinyChatEngine - AWQ
- https://github.com/abacaj/fine-tune-mistral - (Sample code for fine tuning mistral)
- https://twitter.com/abacaj/status/1709647568081240311
- https://www.answer.ai/posts/2024-03-06-fsdp-qlora.html - You can now train a 70b language model at home We’re releasing an open source system, based on FSDP and QLoRA, that can train a 70b model on two 24GB GPUs.
- https://arxiv.org/abs/2402.13753 - LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
- https://www.reddit.com/r/LocalLLaMA/comments/1b5uv86/perplexity_is_not_a_good_measurement_of_how_well/
- https://arxiv.org/abs/2403.07691 - ORPO: Monolithic Preference Optimization without Reference Model
- https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/
- https://research.myshell.ai/jetmoe - JetMoE-8B is trained with less than $0.1 million cost but outperforms LLaMA2-7B
- https://barryzhang.substack.com/p/our-humble-attempt-at-fine-tuning
- https://blog.allenai.org/olmo-open-language-model-87ccfc95f580
- https://www.jasonwei.net/blog/emergence - 137 emergent abilities of large language models
- https://github.com/intel-analytics/ipex-llm?tab=readme-ov-file - This page is probably a nice list of important locally-runnable models. Ignore the Intel thing.
- https://qwenlm.github.io/blog/qwen-moe/
- https://github.com/deepseek-ai/DeepSeek-VL?tab=readme-ov-file
- https://arxiv.org/abs/2406.02528 - Scalable MatMul-free Language Modeling
- “No language left behind” (NLLB-200) translation model
- https://www.nature.com/articles/s41586-024-07335-x
- https://github.com/facebookresearch/fairseq/
- https://ai.meta.com/research/no-language-left-behind/
- https://github.com/unslothai/unsloth - good reviews
- https://www.reddit.com/r/LocalLLaMA/comments/1eqec8v/an_extensive_open_source_collection_of_rag/ - RAG discussions
- https://github.com/microsoft/BitNet (1-bit llm)
- https://www.reddit.com/r/LocalLLaMA/comments/1ggmsmo/smollm2_the_new_best_small_models_for_ondevice/ - Small models
- https://www.reddit.com/r/LocalLLaMA/comments/1g03rdn/hidden_gem_happzy2633qwen257binsv3_is_an/ - Early reasoning model that came before DeepSeek-R1
Toys
- http://www.ioccc.org/2019/mills/hint.html - RNN Machine Learning in C
- https://github.com/blobcity/autoai https://news.ycombinator.com/item?id=29198819 - A framework to find the best performing AI/ML model for any AI problem.
- https://sumplete.com/about/ - ChatGPT written game
- https://github.com/xenova/transformers.js - Run 🤗 Transformers in your browser!
- https://github.com/karpathy/nanoGPT - The simplest, fastest repository for training/finetuning medium-sized GPTs.
- https://github.com/karpathy/llm.c
- https://github.com/regrettable-username/llm.metal
- https://github.com/nat/openplayground - An LLM playground you can run on your laptop.
- https://www.ermine.ai/ - Speech recognition in browser
- https://github.com/Torantulino/Auto-GPT - AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
- https://github.com/jjuliano/aifiles - A CLI that organize and manage your files using AI
- https://github.com/eliasdorneles/upiano - A Piano in your terminal
- https://dt.plumbing/ - dt: duck tape for your unix pipes
- https://github.com/susam/fxyt - FXYT is a tiny canvas colouring language that consists of 36 simple stack-based commands
- http://fractaljourney.blogspot.com/
Interesting AI Use Cases And Applications
- https://www.reddit.com/r/LocalLLaMA/comments/1gjfajq/i_got_laid_off_so_i_have_to_start_applying_to_as/ – Reads the CV and automatically tries to answer questions in forms.
Data and Datasets
- ☆ https://github.com/awesomedata/awesome-public-datasets
- https://gluebenchmark.com/tasks - Language ability tests for LLMs
- https://analytics.usa.gov/ provide a window into how people are interacting with the government online
- https://huggingface.co/datasets/databricks/databricks-dolly-15k databricks-dolly-15k is an open source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.
Crazy Flexes
- https://bellard.org/ - Fabrice Bellard’s Home Page
- https://twitter.com/David3141593/status/1573218394358386688 The image in this tweet displays its own MD5 hash. You can download and hash it yourself, and it should still match - 1337e2ef42b9bee8de06a4d223a51337 I think this is the first PNG/MD5 hashquine.
- https://browsix.org/ UNIX IN YOUR BROWSER TAB (implement POSIX in the browser)
- https://wavvy.app/ - WASM port of Audacity
- https://fathy.fr/carbonyl Forking Chrome to render in a terminal
- https://obie.medium.com/my-kids-and-i-just-played-d-d-with-chatgpt4-as-the-dm-43258e72b2c6 - My kids and I just played D&D with ChatGPT4 as the DM, by Obie Fernandez (Medium)
- https://old.reddit.com/r/MachineLearning/comments/11vfbo9/p_we_gave_gpt35_tools_that_developers_use_and_let/ - [P] We gave GPT-3.5 tools that developers use and let it use them in a sandboxed cloud environment (Demo) (r/MachineLearning)
- https://thume.ca/2020/04/18/telefork-forking-a-process-onto-a-different-computer/ - Teleforking a process onto a different computer! - Tristan Hume
- https://szopa.medium.com/teaching-chatgpt-to-speak-my-sons-invented-language-9d109c0a0f05 - Teaching ChatGPT to Speak my Son’s Invented Language, by Ryszard Szopa (Medium)
- https://twitter.com/LinusEkenstam/status/1663566567311900672 - Generative Fill of memes
- http://kylehalladay.com/blog/2020/05/20/Rendering-With-Notepad.html - Kyle Halladay - Ray Tracing In Notepad.exe At 30 FPS
- https://eieio.games/nonsense/game-11-flappy-bird-finder/ - Flappy Dird: Flappy Bird implemented in MacOS Finder (eieio.games)
- https://ffmpeg.app/ https://ffmpegwasm.netlify.app/
- https://www.youtube.com/watch?v=DcYLT37ImBY - Training AI to Play Pokemon with Reinforcement Learning
- https://github.com/kparc/bcc/blob/master/d/sidenotes.md#style
- https://github.com/kelas/ooj
- https://github.com/kparc/pf
- https://www.shadertoy.com/user/iq
- https://www.youtube.com/playlist?list=PL0EpikNmjs2CYUMePMGh3IjjP4tQlYqji
- https://news.ycombinator.com/item?id=39488668
- https://news.ycombinator.com/item?id=39799755
- https://oimo.io/works/life/
- https://gamengen.github.io/ GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality. GameNGen can interactively simulate the classic game DOOM at over 20 frames per second on a single TPU
- https://replicate.com/blog/flux-is-fast-and-open-source - Super fast image generation from prompt
Experimental Languages
- ☆ https://github.com/rui314/minilisp
- https://vlang.io/ - simple Go-like language that compiles down to C
- https://haxe.org/ Haxe is an open source high-level strictly-typed programming language with a fast optimizing cross-compiler. Haxe can build cross-platform applications targeting JavaScript, C++, C#, Java, JVM, Python, Lua, PHP, Flash, and allows access to each platform’s native capabilities. Haxe has its own VMs (HashLink and NekoVM) but can also run in interpreted mode.
- https://github.com/racket/zuo Zuo: A Tiny Racket for Scripting
- https://github.com/JeffBezanson/femtolisp Small LISP used as base of Julia parser
- https://github.com/cesanta/v7 (Small JavaScript engine)
- https://github.com/cesanta/mjs (Small JavaScript engine)
- https://github.com/cesanta/elk (Embedded JavaScript engine)
- https://github.com/rochacbruno/py2rs - From Python into Rust
- https://roc-lang.org - The Roc Programming Language
- https://wingolog.org/archives/2013/01/07/an-opinionated-guide-to-scheme-implementations - an opinionated guide to scheme implementations — wingolog
Software Engineering
- https://thorstenball.com/blog/2022/05/17/professional-programming-the-first-10-years/ - Thorsten Ball - Professional Programming: The First 10 Years
- https://staffeng.com/ - Stories of reaching Staff-plus engineering roles (StaffEng)
- https://chriskiehl.com/article/thoughts-after-6-years - Software development topics I’ve changed my mind on after 6 years in the industry
- https://github.com/jorgef/engineeringladders/blob/master/TechLead-EngineeringManager.md
- https://www.slideshare.net/reed2001/culture-1798664/ Netflix Culture Document (and lots of management philosophies)
- https://www.hillelwayne.com/post/are-we-really-engineers/ - Are We Really Engineers?
- https://www.remotemobprogramming.org/ - Remote Mob Programming. How we do Remote Mob Programming.
- https://medium.com/@pravse/the-maze-is-in-the-mouse-980c57cfd61a - An interesting critique of Google’s internal culture
- https://www.qword.net/2023/04/30/maybe-you-should-store-passwords-in-plaintext - Maybe you should store passwords in plaintext (QWORD)
- https://github.com/kuchin/awesome-cto - A curated and opinionated list of resources for Chief Technology Officers, with the emphasis on startups
- https://www.businessinsider.com/middle-managers-great-flattening-organization-meta-tech-layoffs-firing-2023-3 - Companies Are Laying Off Middle Managers, That’s a Huge Mistake
- https://github.com/ZachGoldberg/Startup-CTO-Handbook/blob/main/StartupCTOHandbook.md
Potentially Useful Apps
- http://avidemux.sourceforge.net/ - Video editing programs
- https://github.com/mifi/lossless-cut - Video editing programs
- https://github.com/SpaceVim/SpaceVim - A community-driven modular vim/neovim distribution - The ultimate vimrc
- https://github.com/schappim/macOCR mac app OCR screenshots
- https://bonsaibrowser.com/ - Bonsai Web Browser for Research
- https://httptoolkit.tech/ - HTTP Toolkit is a beautiful & open-source tool for debugging, testing and building with HTTP(S) on Windows, Linux & Mac.
- https://lav.io/notes/videogrep-tutorial/ - Videogrep Tutorial
- https://tuple.app/ The best remote pair programming app on macOS
- https://sqliteviewer.app/ https://github.com/qwtel/sqlite-viewer-vscode
- https://shortcat.app/ - Manipulate macOS masterfully, minus the mouse.
Business Tools
- ☆ https://www.metaculus.com/home/ - https://www.metaculus.com/questions/ - Predictive Bets / Predictive Markets
- ☆ https://news.ycombinator.com/item?id=39012544 - Where can I find good legal documents?
- https://shipfa.st/tools/logo-fast - lazy way to make nice logos
- http://sumome.com/ - Sumo. FREE email capture tool (Newsletter)
- https://hootsuite.com/ - Social Media Management startup company
- https://projectshield.withgoogle.com/en/ - DDoS/security help from Google
- https://www.pythonanywhere.com/ - Python Anywhere - Host, run, and code Python in the cloud!
- https://www.digitalocean.com/pricing/ - Hosting
- https://www.peer5.com/ - Serverless CDN
- https://www.tarsnap.com/ - Cheap and efficient (paid) backup system (by a single developer… author of bsdiff)
- https://www.deepl.com/en/translator - better than google translate, allegedly
- https://protonmail.com/pricing - Secure mail
- https://render.com - Cheap and seems robust hosting service
- https://pdfreal.com/ The Internet’s fastest PDF tools. Secure and anonymous. Free of charge.
- https://playingcards.io/ - Meet and play online from anywhere in the world
- https://www.media.net/ - Advertising network aimed at English-speaking countries; an alternative to AdSense.
- TransferWise (or Wise) - Fintech service for easy wire transfers and currency exchange. Offers local bank accounts for receiving money.
- Payoneer - Multiple ways to get paid online by international clients: making international payments, receiving funds, managing your digital business, or accessing capital.
- https://www.snigel.com/blog/top-5-adsense-alternatives-that-can-increase-website-revenue/ - Review of Adsense alternatives by an adsense competitor
- https://browserflow.app/ Automate your work on any website
- https://www.pullrequest.com/pricing/ Code Review as a Service
- HN discussion: https://news.ycombinator.com/item?id=29623505
- https://beebom.com/best-alexa-com-alternatives/ - 7 Best Alexa.com Alternatives for Website Ranking (2021) (Beebom)
- https://www.re3data.org/ - Search for data repositories
- http://www.fmwconcepts.com/imagemagick/index.php - A collection of image utilities that uses imagemagick
- https://panelbear.com/ Alternative to Google Analytics
- https://www.rescuetime.com/ - Track time used on Computer
- http://ankisrs.net/ - Flash cards for remembering things
- https://retool.com/ Build internal tools, remarkably fast (seems like a modern VB6)
- https://www.craiyon.com/ - DALL-E-like model, free to use, supported by ads
- https://www.midjourney.com/ - Image generation model, used through Discord, subscription based
- https://www.reddit.com/user/Wiskkey/comments/p2j673/list_part_created_on_august_11_2021/ This list is a part of master post List of sites/programs/projects that use OpenAI’s CLIP neural network for steering image/video creation to match a text description. All of the items in this post use VQGAN as an image generator; some items may also use another image generator
- https://neuralblender.com/ (Another VQGAN+CLIP image generator - quite fast!)
- http://commoncrawl.org/ - We build and maintain an open repository of web crawl data that can be accessed and analyzed by anyone.
- https://www.thisworddoesnotexist.com/ (probably a very good tool to find real-sounding words)
- https://roadmap.sh/ - educational content to help guide the developers in picking up the path and guide their learnings
- https://invidious.snopyta.org (alternative Youtube frontend)
- https://flippa.com/ #1 marketplace to buy and sell (websites)
- https://exercism.org/ Develop fluency in 66 programming languages with our unique blend of learning, practice and mentoring.
- https://www.prowebtips.com/best-adsense-alternatives-for-maximize-earnings/#Why_should_you_use_Adsense_alternative - 10 Best AdSense Alternatives for Maximize Earnings in 2021
- https://www.newsminimalist.com/about - About News Minimalist
- https://gptforwork.com/ - Use ChatGPT in Google Sheets and Docs
- redmine, trello, asana - Project Management Tools
- https://news.ycombinator.com/item?id=38276515 Is Delaware the cheapest place to incorporate?
- https://news.ycombinator.com/item?id=39604961 RAG on codebases that actually works
- (1) Instead of directly embedding code, we parse the AST of the codebase, recursively generate docstrings for each node in the tree, and then embed the docstrings. (2) Alongside vector similarity search and keyword search, we do “agentic search” where an agent reviews the relevance of the search results, and scans the source code to follow references that might lead to something important. Then it returns the relevant sources.
- https://observablehq.com/product - Observable offers a modern way to create and host powerful, performant data apps. Use Markdown, JavaScript and SQL, Python, R, or any other language you choose.
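Note on the “RAG on codebases” entry above: step (1), walking the AST and attaching a description to each node before embedding, can be sketched roughly as follows. This is a minimal illustration and not the linked project's implementation. It assumes Python source, reuses existing docstrings in place of LLM-generated ones, and stops before the embedding step. The function name `describe_nodes` and the sample input are hypothetical.

```python
import ast


def describe_nodes(source: str) -> list[tuple[str, str]]:
    """Return (qualified_name, text_to_embed) pairs for each class and function.

    Assumption: the existing docstring stands in for the generated
    description; a real pipeline would call a model here instead.
    """
    tree = ast.parse(source)
    pairs: list[tuple[str, str]] = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                name = prefix + child.name
                # Fall back to a placeholder when the node has no docstring.
                text = ast.get_docstring(child) or f"(no docstring) {type(child).__name__} {child.name}"
                pairs.append((name, text))
                # Recurse so methods and nested functions get qualified names.
                visit(child, name + ".")
            else:
                visit(child, prefix)

    visit(tree, "")
    return pairs


SAMPLE = '''
class Greeter:
    """Says hello."""
    def greet(self, name):
        return "hi " + name
'''

for qualified_name, text in describe_nodes(SAMPLE):
    print(qualified_name, "->", text)
```

Each resulting text would then be passed to whatever embedding model is in use; keeping the qualified name next to it lets a search hit be mapped back to the source location.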
Graphics, Design, etc.
- https://dribbble.com - Discover the world’s top designers & creatives Dribbble is the leading destination to find & showcase creative work and home to the world’s best design professionals.
- http://my-free-vector-art.com/ - Free vector art
- http://www.makeuseof.com/tag/free-fonts-sites-where-find-them/ - The 11 Best Free Font Websites for Free Fonts Online
- https://www.mattcrampton.com/blog/mega_list_of_free_image_sites_for_blogging/ - The Mega List Of Free Image Sites For Blogging - MattCrampton.com
- https://iconduck.com/ 118,894 free open source icons & illustrations
- https://tailwindcomponents.com/ A free repository for community components using Tailwind CSS Open source Tailwind UI components and templates to bootstrap your new apps, projects or landing sites!
- https://blog.roastmylandingpage.com/landing-page-roasts/ - What I learnt roasting 200 landing pages in 12 months 🍗 👀
HOWTOs, Beginners instructions, Learn stuff
- ☆ https://p403n1x87.github.io/running-c-unit-tests-with-pytest.html https://news.ycombinator.com/item?id=30301880
- http://gitready.com/ - Git learning site
- http://learnvimscriptthehardway.stevelosh.com/ - Learn Vimscript the Hard Way
- https://www.youtube.com/watch?v=pTr1uLQTJNE - Simon Willison - Instant serverless APIs, powered by SQLite (comment: “Little data” using datasette (and other sqlite-focused tools))
- https://www.youtube.com/watch?v=w2nKIGhXPAM - Jess Shapiro - Everything at Once: Python’s Many Concurrency Models - PyCon 2019
- https://towardsdatascience.com/python-tools-for-a-beginner-data-scientist-39b3b9a4303a - Python Tools for a Beginner Data Scientist, by Rishi Sidhu (Towards Data Science)
- https://github.com/a327ex/blog/issues/30 - This tutorial series will cover the creation of a complete game with Lua and LÖVE
- https://www.youtube.com/c/PrimerLearning/videos - Simulating evolution
- https://www.pagetable.com/?p=764 Using the OS X 10.10 Hypervisor Framework: A Simple DOS Emulator
- https://www.youtube.com/watch?v=7aONIVSXiJ8 Introduction to Memory Management in Linux (very informative; useful for systems programming)
- https://beautifulracket.com/appendix/why-lop-why-racket.html - Beautiful Racket: Why language-oriented programming? Why Racket?
- https://github.com/fastai/course-nlp - A Code-First Introduction to NLP course
- https://aphyr.com/posts/265-getting-started-in-software - Getting started in software
- https://www.youtube.com/watch?v=ImLFlLjSveM How C++20 Changes the Way We Write Code - Timur Doumler - CppCon 2020
- https://posthog.com/blog/what-to-ask-in-interviews - The really important job interview questions engineers should ask (but don’t) - PostHog
- https://web.mit.edu/6.001/6.037/sicp.pdf Structure and Interpretation of Computer Programs
- http://infolab.stanford.edu/~ullman/mmds/book.pdf - Book: Mining of Massive Datasets
- https://www.clientside.dev/explore - Practice with projects & problems taken from real interviews. Each with a set of unit tests so you never miss an edge case and solution explanations written by senior engineers to guide you.
- https://github.com/google/comprehensive-rust - a multi-day Rust course / covers all aspects of Rust, from basic syntax to generics and error handling
- https://pytorch.org/tutorials/beginner/introyt/modelsyt_tutorial.html - PyTorch Tutorial
- https://viralinstruction.com/posts/hardware/ - What scientists must know about hardware to write fast code (good for junior software engineers as well)
- https://software.rajivprab.com/2018/04/29/myths-programmers-believe-about-cpu-caches/ - Myths Programmers Believe about CPU Caches – Software the Hard way
- https://dwheeler.com/essays/filenames-in-shell.html - Filenames and Pathnames in Shell (bash, dash, ash, ksh, and so on): How to do it Correctly
- https://spencermortensen.com/articles/email-obfuscation/ - Email obfuscation: What works in 2023?
Crypto
- https://news.ycombinator.com/item?id=29366310 Proof of stake is incapable of producing a consensus (yanmaani.github.io)
- https://news.ycombinator.com/item?id=29264374 Ask HN: What are you using for public documentation these days?
- https://www.psl.com/feed-posts/web3-engineer-take - Feed
ProblemSets
- https://projectlovelace.net/problems/ - Problems, Project Lovelace
- https://open.kattis.com/ - Kattis
- https://adventofcode.com/ - Advent of Code 2023
- https://protohackers.com/ - Implementing network protocols
- https://cryptopals.com/ - cryptography fundamentals
Archive, Stories, Articles, Reports, etc.
- ☆ https://motherduck.com/blog/big-data-is-dead/
- ☆ https://www.mostlypython.com/django-from-first-principles-part-2/ - Single-file Django web app tutorial
- ☆ https://longnow.org/essays/richard-feynman-connection-machine/
- ☆ http://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/ - A very long, informative article on everything about code and software (of little use to software professionals). A good intro for outsiders who are willing to read it.
- ☆ https://killedbygoogle.com/
- ☆ https://mtlynch.io/solo-developer-year-5/
- http://joyofandroid.com/how-to-downgrade-samsung-galaxy-s3-easily/ - How To Downgrade Samsung Galaxy S3 Easily: We Found The Answer - JoyofAndroid
- https://lists.debian.org/debian-ctte/2013/12/msg00234.html - init system other points, and conclusion
- http://veertu.com - OSX app store native virtualization
- https://github.com/mist64/xhyve - The xhyve hypervisor is a port of bhyve to macOS. It is built on top of Hypervisor.framework in OS X 10.10 Yosemite and higher, runs entirely in userspace, and has no other dependencies
- http://arstechnica.com/information-technology/2016/03/to-sql-or-nosql-thats-the-database-question/ - To SQL or NoSQL? That’s the database question (Ars Technica)
- http://people.csail.mit.edu/mrub/vidmag/ - Video Magnification
- http://stackoverflow.com/research/developer-survey-2016 - Stack Overflow Developer Survey 2016 Results
- https://itunes.apple.com/hk/itunes-u/vintage-mac-video/id421137560?mt=10 - Lots and lots of old videos about the Mac
- http://portingkit.com/game/385 - AOE installation on MacOS (that works, apparently)
- https://nickcraver.com/blog/2017/05/22/https-on-stack-overflow/ - Nick Craver - HTTPS on Stack Overflow: The End of a Long Road
- http://phdcomics.com/comics/archive.php?comicid=1628 - PHD Comics: The Digital Humanities
- https://medium.com/@liwaiyin430/%E9%BB%9E%E8%A7%A3%E8%A6%81%E5%AD%B8%E5%AF%AB-code-topic-modeling-ab334ef9734c - Machine Learning blogger in Cantonese
- https://arstechnica.com/science/2019/01/machine-learning-can-offer-new-tools-fresh-insights-for-the-humanities/ - Machine learning can offer new tools, fresh insights for the humanities (Ars Technica)
- http://mattmahoney.net/dc/text.html Large Text Compression Benchmark
- https://www.youtube.com/watch?v=kW6ZLB9-PLw - I Built a HUGE 336TB Server Without Linus Tech Tips!
- https://www.youtube.com/watch?v=5J2yoKmJUQE The best coding apps for kids
- https://www.youtube.com/watch?v=kkt_BtR9Kzk - RPython - the restricted subset of Python used by PyPy, with C-level speed.
- https://news.ycombinator.com/item?id=25113482 - Ok Google: please publish your DKIM secret keys
- http://www.arewewebyet.org/ - Rust for Web.
- https://blog.openai.com/better-language-models/ (GPT2)
- https://www.cna.com.tw/news/firstnews/201903200282.aspx - AI that invents Chinese characters
- https://degoogle.jmoore.dev/#browser-extensions - Cutting Google out of your life
- https://github.com/mikelxc/Workarounds-for-ARM-mac - This repository describes how I get most of my configurations to work on the new Apple Silicon Mac
- https://www.youtube.com/watch?v=UHV-oNKK9vs How to make a USB to DC Cable from an old Mouse or Keyboard USB Cable Recycle & Hack
- https://alexkrupp.typepad.com/sensemaking/2021/06/django-for-startup-founders-a-better-software-architecture-for-saas-startups-and-consumer-apps.html - Django for Startup Founders: A better software architecture for SaaS startups and consumer apps
- https://play.elevatorsaga.com/?utm_source=hackernewsletter&utm_medium=email&utm_term=fun - Elevator programming game.
- https://lihkg.com/thread/2457149/page/1 IT discussion board (114) 1HBU=50K
- https://blog.immersed.team/working-from-orbit-39bf95a6d385 (VR/AR workspace)
- https://graphite.dev/blog/post/DThX8ffP1gmxWJChEv0y Stacked changes: how Facebook and Google engineers stay unblocked and ship faster
- https://www.npr.org/2010/10/09/130451369/the-zombie-network-beware-free-public-wifi - The Zombie Network: Beware ‘Free Public WiFi’ : NPR
- https://www.youtube.com/watch?v=jmTwlEh8L7g DEF CON 26 - Christopher Domas - GOD MODE UNLOCKED Hardware Backdoors in redacted x86
- https://www.ibiblio.org/harris/500milemail.html (Funny)
- https://blog.sanctum.geek.nz/vim-koans/ (Funny)
- https://stevelosh.com/blog/2013/04/git-koans/ (Funny)
- https://www.destroyallsoftware.com/talks/wat Roasting Ruby and JS! (Funny)
- https://klinger.io/posts/fyi-how-founders-can-avoid-drive-by-management - #fyi: How founders can avoid drive-by-management (Andreas Klinger)
- https://news.ycombinator.com/item?id=27136539 very interesting discussion on the implications of the CAP theorem…
- https://caffeinedev.medium.com/how-to-install-tensorflow-on-m1-mac-8e9b91d93706 - How To Install TensorFlow on M1 Mac (The Easy Way), by Prabhat Kumar Sahu (Medium)
- https://news.ycombinator.com/item?id=33257455 Try disabling slide to type. I don’t use it, so if it’s on, it introduces inaccuracies.
- https://beautifulracket.com/appendix/why-racket-why-lisp.html - Beautiful Racket: Why Racket? Why Lisp?
- https://twitter.com/marcan42/status/1494213855387734019 turns out Apple’s custom NVMe drives are amazingly fast - if you don’t care about data integrity.
- https://amyunger.com/blog/2020/09/10/staff-engineer-at-heroku.html - How I operated as a Staff engineer at Heroku
- https://tinyprojects.dev/posts/i_spent_two_years_launching_tiny_projects - I Spent 2 years Launching Tiny Projects (Tiny Projects)
- https://www.stiftung-nv.de/en/publication/understanding-global-chip-shortages - Understanding the global chip shortages (Stiftung Neue Verantwortung, SNV)
- https://github.com/shiraeeshi/hstetr-first - How a Java Programmer Wrote Console Tetris In Haskell And What The Learning Curve Was Like
- https://dallasinnovates.com/exclusive-qa-john-carmacks-different-path-to-artificial-general-intelligence/ - Exclusive Q&A: John Carmack’s ‘Different Path’ to Artificial General Intelligence » Dallas Innovates
- https://interviewing.io/guides/system-design-interview - System Design Interview Guide for Senior Engineers
- It says it’s a guide to system design interviews, but the actual content seems to cover much more than that…
- https://nymag.com/intelligencer/2023/02/the-silicon-valley-loop-malcolm-harriss-palo-alto.html - The Silicon Valley Loop, Malcolm Harris’s Palo Alto
- https://mitchellh.com/writing/my-startup-banking-story - My Startup Banking Story – Mitchell Hashimoto
- ☆ https://www.joelonsoftware.com/2004/06/13/how-microsoft-lost-the-api-war/
- https://www.vanityfair.com/news/business/2014/11/satya-nadella-bill-gates-steve-ballmer-microsoft - Can C.E.O. Satya Nadella Save Microsoft? (Vanity Fair)
- https://500mile.email/ - 500 Mile Email • Absurd Bug Stories
- https://medium.com/@laurajavier/google-slides-is-actually-hilarious-83c1ced857ee - Google Slides is Actually Hilarious, by Laura Javier (Medium)
- https://migeel.sk/blog/2024/01/02/building-a-self-contained-game-in-csharp-under-2-kilobytes/
- https://papereditor.app/dev - 9 years of Apple text editor solo dev
- https://loglog.games/blog/leaving-rust-gamedev/
Python
- https://github.com/facebookincubator/cinder - Cinder is Instagram’s internal performance-oriented production version of CPython 3.8
- numba - A High Performance Python Compiler
- numpy - The fundamental package for scientific computing with Python
- cython - The Cython language is a superset of the Python language that additionally supports calling C functions and declaring C types on variables and class attributes
- pyston - Pyston is a fork of CPython 3.8.8 with additional optimizations for performance
- pypy
- PyTorch - An open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing
- https://github.com/python/mypy - static type checker
- https://blog.kevmod.com/2017/02/personal-thoughts-about-pystons-outcome/ - Personal thoughts about Pyston’s outcome – kmod’s blog
- https://github.com/tonybaloney/Pyjion - Pyjion, a JIT extension for CPython that compiles your Python code into native CIL and executes it using the .NET CLR.
- https://pyre-check.org/ - type checker
- https://github.com/jwilk/python-syntax-errors Statements that are syntax errors on older Python versions. The idea is to put such a statement near the top of your file. If a user inadvertently ran the code against an older version, they would get a fairly helpful error message.
- https://github.com/satwikkansal/wtfpython Exploring and understanding Python through surprising snippets. (weird language quirks etc.)
- https://github.com/google/python-fire - Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
- https://github.com/chriskiehl Gooey: Turn almost any Python command line program into a GUI application https://news.ycombinator.com/item?id=27490291
- https://colab.research.google.com/github/philzook58/z3_tutorial/blob/master/Z3%20Tutorial.ipynb - Z3, a very nice Python library for theorem proving and solving algebra problems
- https://github.com/zanellia/prometeo – a Python-to-C transpiler for high-performance computing
- https://ed25519.cr.yp.to/python/ed25519.py Python implementation of Ed25519
- https://github.com/tandav/pipe21 - Simple functional pipes
- https://pip.wtf/ - Inline dependencies for small Python scripts.
- https://github.com/xonsh/xonsh - Python-powered, cross-platform, Unix-gazing shell (HN discussion: https://news.ycombinator.com/item?id=39368586)
- https://github.com/Maratyszcza/PeachPy - Portable Efficient Assembly Code-generator in Higher-level Python (PeachPy)
Rust
- ☆ https://github.com/rust-unofficial
- https://github.com/rust-unofficial/awesome-rust - A curated list of Rust code and resources.
- https://github.com/rust-unofficial/rust-practise-questions
- https://github.com/rust-unofficial/patterns
- https://github.com/rust-lang/miri - An experimental interpreter for Rust’s mid-level intermediate representation (MIR). It can run binaries and test suites of cargo projects and detect certain classes of undefined behavior
- https://github.com/diesel-rs/diesel - A safe, extensible ORM and Query Builder for Rust
- https://github.com/johnthagen/min-sized-rust - Minimizing Rust Binary Size - This repository demonstrates how to minimize the size of a Rust binary.