synesis

Bloomberg trained for the Paris marathon with ChatGPT. The model’s own closing line: the future is people who learn to build disciplined working relationships with AI.

Apr 12, 2026

Why LLMs Still Stumble Over Time

LLMs

reasoning

temporal reasoning

commonsense

evaluation

research

Temporal reasoning in LLMs isn’t one skill but a bundle of them — and even when the calendar math is right, it stays brittle.

Apr 11, 2026

iPhone, Artemis II, Moon

Apple

space

links

iPhone

Artemis II

moon

history

Astronauts onboard Artemis II took pictures with the iPhone 17 Pro Max.

Apr 6, 2026

Filesystems vs. RAG

AI engineering

agentic systems

RAG

LLMs

generative AI

links

Replacing a docs RAG flow with a virtual filesystem reframes the question from “what’s relevant?” to “how does the model investigate?”

Apr 4, 2026

The Revenge of the Data Scientist

AI engineering

agentic systems

LLMs

generative AI

hallucination

evaluation

links

Hamel Husain on the surrounding machinery — logs, metrics, traces, tests, specs — that turns model calls into a working agentic system.

Apr 2, 2026

Artemis II Launches on Apple’s 50th Anniversary

Apple

space

history

Artemis II

moon

Apollo

links

The first crewed lunar journey since Apollo 17 in 1972 — coincident, by accident, with Apple turning fifty.

Apr 2, 2026

Learning to Reason in 13 Parameters

LLMs

reasoning

generative AI

fine-tuning

links

TinyLoRA: an 8B Qwen2.5 reaches 91% on GSM8K with only 13 trained bf16 parameters — 26 bytes of learned weights.

Mar 31, 2026

How Apple Became Apple

Apple

history

links

Fast Company’s oral history of Apple’s earliest days, told by the people who lived it — published as Apple turns 50.

Mar 30, 2026

Redmond City Marathon: #30, with a Sprained Ankle

running

marathon

personal

race

Marathon #30 at the Redmond City Marathon — first race paced by HR. A sprained ankle at mile 2.6 didn’t stop the 7-week streak.

Mar 29, 2026

LLM Neuroanatomy: Topping the Leaderboard Without Changing a Weight

LLMs

inference

links

Duplicate a block of middle transformer layers and run them twice at inference. The power of thinking it over twice.

Mar 24, 2026

Snowflake Cortex AI Escapes Sandbox and Executes Malware

agentic systems

security

AI safety

prompt injection

links

An indirect prompt injection that turned a repo README into shell commands run outside the sandbox — and the multi-layer observability gap it exposed.

Mar 19, 2026

Language Puzzles, NACLO, and a Note of Thanks

linguistics

NACLO

computational linguistics

education

personal

links

Scientific American’s piece on the North American Computational Linguistics Open Competition — and a personal thank-you to the people who built the world that drew me into linguistics.

Mar 19, 2026

Journey into Coding with AI [3/4]: Decision-Bound Programming

coding

software engineering

AI engineering

journey series

AI accelerates code generation but moves the bottleneck to interpretation, comparison, and judgment. The next generation of programming tools should be decision support systems.

Mar 15, 2026

Knuth’s Hamiltonian Cycles, Solved by Claude

mathematics

Mathematica

Claude

generative AI

links

Donald Knuth’s Hamiltonian cycle decomposition problem from The Art of Computer Programming was solved by Claude Opus 4.6.

Mar 8, 2026

Woodinville Half: HM #76, the Comeback

running

half marathon

video

personal

race

1:44:20 at the Woodinville Half. Finally feeling back after a rough January — VO₂max and easy pace both recovered.

Mar 8, 2026

iPhone and iPad Approved for Classified NATO Information

Apple

iPhone

iPad

security

links

The first and only consumer devices to meet these international government security standards.

Feb 26, 2026

Marcus vs. Brundage: A Concrete AGI Bet

AGI

predictions

links

What do we actually mean by AGI? Marcus and Brundage put $22,000 on a 10-task definition by end of 2027 — and prediction markets disagree sharply with Brundage.

Feb 23, 2026

The ‘CS Exodus’ as Discipline Evolution

computer science

education

research

history

links

What looks like an exodus from Computer Science is more accurately a shift toward AI-native computing — the latest move in a long lineage that began with Electrical Engineering.

Feb 15, 2026

Anthropic on AI Coding: Scaffold, Not Substitute

coding

software engineering

developer productivity

learning

Anthropic

links

Anthropic’s study finds that delegating to AI hurts comprehension, but using it for conceptual questions and error explanations correlates with better learning.

Jan 30, 2026

Humans Need Rest, Even When Machines Don’t

coding

software engineering

future of work

burnout

links

Ars Technica on burning out with AI coding agents — and the suggestion that knowledge workers may need new protections.

Jan 19, 2026

Inside the M5: As Big as New Jersey

Apple

semiconductor

iPhone

video

links

Think of the state of New Jersey next time you whip out your iPhone.

Jan 19, 2026

Bridle Trails 50K: Hot Cocoa, Sprained Ankle, No Horseshit

running

50K

ultra

trail running

video

personal

race

My 50K race went sideways at loop 4 with an ankle/calf injury, but I held the rule that matters in trail running.

Jan 12, 2026

End of 2025: 2,543 Miles, 342 Days, Six PRs

running

video

personal

End-of-year retrospective: 2,543.3 miles in 2025 across 342 days; PRs at every distance from 5K to 50K.

Dec 31, 2025

Five Takes from Zhengdong Wang’s 2025 Letter

year in review

links

A thoughtful annual letter on where AI actually is, and where it’s heading.

Dec 31, 2025

Parkrun Turkey Trot: 22:09 on Thanksgiving

running

Parkrun

personal

race

A record 270+ runners showed up; not a PR but felt strong the whole way.

Nov 27, 2025

“Skills” Are Not Software Engineering

LLMs

agentic systems

software engineering

AI engineering

Claude

Anthropic

Anthropic’s skills framework is a meaningful step for agent tooling — but skills cannot replace the orchestration, contracts, and correctness guarantees that real software systems need. A hybrid model is what’s next.

Nov 24, 2025

9th 50K Long Run: Redmond Library to the Seattle Office

running

ultra

long run

video

personal

31.12 miles, two McDonald’s pit stops, ended at the office with a hot shower.

Nov 9, 2025

LLMs Can Get Brain Rot Too

LLMs

reasoning

paper

links

Training on viral, low-quality data costs models reasoning ability and long-context understanding — and clean retraining doesn’t fully undo it.

Oct 22, 2025

Karpathy: AGI Is Still a Decade Away

AGI

agentic systems

LLMs

coding

links

Four takeaways from Karpathy’s chat with Dwarkesh — the decade of agents, RL through a straw, and why coding LLMs still aren’t reliable collaborators.

Oct 18, 2025

Snohomish River Run: Marathon #29, Survived from Mile 18

running

marathon

personal

race

4th marathon of the year (29th overall): chip time 3:38:09, perfect weather — until gravel and a calf flare-up at mile 18.

Oct 13, 2025

Vibe Engineering

agentic systems

software engineering

coding

AI engineering

links

Vibe engineering = using AI and coding agents responsibly, as an acceleration tool, while retaining accountability, oversight, and engineering discipline.

Oct 8, 2025

Terence Tao’s ChatGPT Research Buddy

ChatGPT

research

mathematics

links

Terence Tao showed how he used ChatGPT like a research buddy — back-and-forth on a MathOverflow counterexample, hours saved, even a few math slips caught.

Oct 2, 2025

A Pure Republic of Classical Music

Apple

Apple Music

classical music

music

generative AI

links

On Apple Music Classical’s anonymous editors — and why their anonymity is exactly what lets them recommend purely on the merits.

Sep 12, 2025

Journey into Coding with AI [2/4]: Shifting Gears

coding

software engineering

AI engineering

automation

journey series

From stick shift to starship console: how much control do we cede to coding AI, and what role remains for humans?

Sep 6, 2025

Journey into Coding with AI [1/4]: Running Back to Code

coding

software engineering

AI engineering

journey series

A month in with coding AI: it’s like assembling a small team of servant interns. The next abstraction step in programming — concepts compiled into code.

Sep 5, 2025

Embedding Limits: A Linear-Algebra Note (and Kernel Tricks)

LLMs

RAG

embeddings

retrieval

RecSys

paper

research

A follow-up callout: the bound rests on rank(AB) ≤ min(rank(A), rank(B)) — a property of dot-product scoring, not embeddings per se. Plus one more way out: kernel tricks.

Sep 3, 2025

Redmond Harvest Half: HM #64, 1:40:45

running

half marathon

personal

race

1:40:45 at the Redmond Harvest Half — 64th HM since 2022; CoachGPT wasn’t too happy.

Sep 1, 2025

Embeddings Hit a Theoretical Ceiling

LLMs

RAG

embeddings

retrieval

RecSys

agentic systems

paper

links

A new paper proves that single-vector retrieval has a hard dimension-dependent capacity bound — and shows three ways out: cross-encoders, multi-vector, sparse.

Aug 31, 2025

State of AI in Business 2025: Why 95% Get Zero P&L

agentic systems

paper

future of work

links

$30–40B invested, 95% of orgs see no P&L impact — five traits the winning systems share.

Aug 30, 2025

How Apple AirPods Work

Apple

video

links

So much wonderful tech and engineering goes into this teeny tiny daily driver.

Aug 28, 2025

Tunnel Vision Marathon: Heat, Cramps, and a 25 sec/mile Lesson

running

marathon

personal

race

3:40:01 at Tunnel Vision, far from BQ. Then a Runner’s World piece on heat impact reframed everything.

Aug 11, 2025

Apple Shadyside, Old and New

Apple

CMU

history

personal

I still remember September 4, 2004 — standing in line for the opening of the Apple Shadyside store while at CMU. Goodbye, old store. Welcome, new one.

Aug 8, 2025

Mid-Year Running Recap: PRs at Every Distance

running

half marathon

marathon

Parkrun

video

personal

1,697 mi YTD, 283 ahead of plan. PRs at every distance from 5K to half marathon. 11 sec from sub-21 5K, 5 min from BQ.

Aug 3, 2025

MAST: A Failure Taxonomy for Multi-Agent Systems

LLMs

agentic systems

evaluation

paper

links

First empirically grounded taxonomy of why MAS fail — 14 modes across specification, alignment, and verification — and small interventions that move the needle 9–16%.

Aug 2, 2025

MSR: Which Occupations GenAI Is Actually Used In

generative AI

future of work

jobs

paper

links

MSR’s analysis of real-world GenAI usage by occupation — knowledge work and information-providing roles top the list.

Aug 1, 2025

How Anthropic Teams Use Claude Code

coding

software engineering

agentic systems

Anthropic

Claude

generative AI

links

Two patterns from Anthropic’s internal teams: treat Claude like a slot machine (DS/ML) or an iterative partner (Product Eng).

Jul 25, 2025

Chollet: How We Get to AGI

AGI

reasoning

talk

video

links

Notes from François Chollet’s talk: intelligence as process, the ARC suite, and why we need to combine type-1 (continuous) and type-2 (discrete) abstractions.

Jul 7, 2025

‘Positive Review Only’: Hidden AI Prompts in Papers

paper

research

ethics

links

Researchers hiding AI prompts in academic papers to manipulate peer review — a Nikkei Asia investigation. Please don’t do this.

Jul 5, 2025

Apple Music Turns 10

Apple

Apple Music

history

links

10 years ago today, Apple Music went live.

Jun 30, 2025

20 Years of Podcasts on iTunes

Apple

podcasting

history

links

20 years ago today, podcasts went mainstream.

Jun 28, 2025

For Alfred Brendel

music

classical music

Apple Music

personal

On the passing of Alfred Brendel — and one Beethoven Bagatelle that has stayed with me.

Jun 19, 2025

Karpathy: Software Is Changing (Again)

AGI

agentic systems

talk

video

generative AI

links

Notes from Karpathy’s talk: Software 3.0, LLMs as fallible people spirits, the autonomy slider, and the decade of agents.

Jun 19, 2025

ICR² Accepted to ACL 2025 Findings

LLMs

RAG

retrieval

NLP

research

conference

paper

Our paper on Eliciting In-context Retrieval and Reasoning for Long-context LLMs has been accepted into ACL 2025 Findings.

Jun 10, 2025

Mill Town Marathon: 3:24:01, Closer to BQ

running

marathon

personal

race

Marathon PR by ~6 min; 27 minutes off last year’s time on the same course.

Apr 13, 2025

World’s Fastest 10K + Three PRs in Two Weeks

running

10K

half marathon

Parkrun

personal

race

10K PR (43:56), 5K PR (21:23), and a half marathon PR (1:39:07) — all within two weeks.

Mar 23, 2025

Apple Music Classical on the Web

Apple

Apple Music

classical music

music

links

Classical music lovers, rejoice — Apple Music Classical is now available on the web.

Mar 13, 2025

Rancho San Antonio Trail Run, Cupertino

running

trail running

video

personal

9.67 mi, 1,545 ft EG on Rancho San Antonio trails — 3 hours of daylight, well spent.

Mar 2, 2025

DeepSeek vs ChatGPT on Ethical Questions

ChatGPT

DeepSeek

generative AI

ethics

links

ChatGPT seems to follow deontological ethics (rules-based) while DeepSeek aligns with consequentialism (outcome-based). Pre-training or RL-tuning?

Feb 8, 2025

Eliciting In-context Retrieval and Reasoning for Long-context LLMs (Preprint)

LLMs

RAG

retrieval

NLP

research

paper

Our fresh preprint on ICR²: if LLMs had context windows large enough for a whole knowledge base, RAG could collapse into a single step — but are current LCLMs up to the task?

Jan 15, 2025

Bridle Trails Winter 50K: My First Ultra

running

ultra

50K

trail running

personal

race

5:05:42 at the Bridle Trails Winter Running Festival — 9th of 45 runners, in the dark, with a fall on a smooth rock and a knee that nearly ended the race.

Jan 12, 2025

End of 2024: 2,375 Miles, 18 Marathons, 4 50Ks

running

marathon

50K

ultra

year in review

video

personal

End-of-year retrospective: 2,375.4 miles in 2024, 18 full marathons (5 races) and 4 50K runs, with PRs from 5K to 50K and a 34.5-mile max.

Dec 31, 2024

Sutskever at NeurIPS 2024: Pre-Training Era Is Over

generative AI

LLMs

NLP

reasoning

conference

video

links

Ilya Sutskever’s Test of Time award talk at NeurIPS 2024 — pre-training is over, next up is agents/synthetic data/inference-time compute, and reasoning is unpredictable.

Dec 13, 2024

Seattle Marathon 2024: 18th Marathon, GAP PR

running

marathon

half marathon

Parkrun

personal

race

Official 3:33:04 at Seattle Marathon 2024 — 1:49 short of PR, but a hillier course, GAP-faster than Snohomish, and top 9% in age group.

Dec 1, 2024

Languages of New York City

linguistics

links

~700 language varieties from all over the world used in NYC — about 10% of world languages.

Nov 27, 2024

Parkrun PR + 34-Mile Ultra

running

Parkrun

ultra

long run

video

personal

race

Parkrun 5K PR (21:09 official) — 7th PR of the year. Plus my longest run yet: a 34-mile ultramarathon on the Snoqualmie Valley Trail.

Nov 16, 2024

20 Seconds of Thinking, 100,000× More Data

generative AI

LLMs

reasoning

OpenAI

links

Noam Brown at TED AI: a poker bot thinking for 20 seconds matched scaling the model 100,000× and training it 100,000× longer.

Nov 14, 2024

How Well Can Transformers Build World Models?

LLMs

world models

reasoning

transformers

paper

research

Two negative results: SOTA models drop on GSM8K when names/numbers change, and DFA-based metrics show big gaps when the world shifts.

Nov 8, 2024

Snohomish River Run 2024: Marathon PR by 7 Minutes

running

marathon

personal

race

3:31:15 at Snohomish River Run — a 7-minute marathon PR. Slower slow runs, no skipped intervals, better fueling, and a 180-bpm playlist saved the day at mile 14.

Oct 13, 2024

6th Parkrun PR + 2024 Stretch Goals Done

running

Parkrun

marathon

half marathon

personal

race

21:16 at Parkrun 5K — 6th PR of the year. All three 2024 stretch goals checked: sub-7’ pace, 6 PRs, 3 marathon races.

Oct 5, 2024

Around Lake Sammamish: A 27.71-Mile Run

running

marathon

long run

video

personal

13th marathon-length run of 2024, looping Lake Sammamish — a planned 31-mile road run that turned into trail running when Weowna Park happened at mile 12.

Sep 22, 2024

Crossing Lake Washington: A 33.01-Mile Run

running

ultra

50K

long run

video

personal

Second 50K+ run of 2024 and the longest yet — 33.01 miles, 5:21:26 moving time, looping Lake Washington across multiple trails.

Sep 15, 2024

Redmond Harvest Half 2024: 1:45:24 After Flu Recovery

running

half marathon

marathon

personal

race

1:45:24 at Redmond Harvest Half Marathon — 12 minutes faster than last year, after recovering from August’s PR-making marathon and a flu just days earlier.

Sep 2, 2024

Tunnel Vision Marathon 2024: Bathroom, Bad Shoes, OOD

running

marathon

machine learning

personal

race

3:46:13 at Tunnel Vision Marathon — PRs at 10K/15K/HM splits, but a bathroom emergency, slightly large shoes, and untrained quads on a downhill course did the rest.

Aug 11, 2024

Time-Sensitive Knowledge Editing via Efficient Fine-Tuning (ACL 2024)

LLMs

NLP

knowledge editing

fine-tuning

conference

paper

research

PEFT outperforms locate-and-edit for knowledge editing — and applying LoRA to MLP and attention parameters stays robust as the number of edits grows.

Aug 9, 2024

30th Half Marathon: Iron Horse Trail BQ Prep

running

half marathon

marathon

personal

race

30th HM on the Iron Horse Trail: 9 miles up 1,287 ft and 8 miles down. Downhill pace at 7’40” makes BQ feel within reach for the August marathon.

Jul 28, 2024

Tunnel Marathon Course Recon: 53 Miles Over Two Sundays

running

marathon

video

personal

10th marathon since January, run as a recon of the August Tunnel Marathon course — 26.69 miles one weekend, 26.39 the next, both halves of the course in opposite directions.

Jul 14, 2024

First 50K: Carnation to North Bend and Back

running

ultra

50K

long run

video

personal

Longest run yet — 50K/31 miles from Carnation to North Bend and back, 5:18:05 moving time. Plenty of stops for photos and ice cream.

Jun 30, 2024

10K for 7 Days at NAACL 2024 (Mexico City)

running

conference

NAACL

NLP

video

personal

We ran 10K every morning for a week at 2,240 m elevation alongside the conference.

Jun 22, 2024

TWEAK at NAACL 2024: Decoding Without Hallucinations

LLMs

NLP

hallucination

knowledge graphs

conference

paper

research

generative AI

TWEAK ranks decoding candidates by how well their continuations support the input facts — a decoding-only fix that improves faithfulness with minimal quality loss.

Jun 15, 2024

Are Researchers Using LLMs to Write Their Papers?

LLMs

NLP

research

paper

generative AI

Liang et al. analyze 950k papers and find up to 17.5% show signs of LLM usage in Computer Science.

Jun 3, 2024

Drumheller Marathon 2024: 220 Laps to a New PR

running

marathon

personal

race

First-ever marathon at the UW Drumheller race — 220 laps, 3:38:57 chip time (PR), with a left ankle scare at mile 20 saved by switching to cushioned shoes.

Jun 1, 2024

AI on Trial: Legal Models Hallucinate in 1 out of 6 Queries

LLMs

NLP

hallucination

RAG

paper

generative AI

links

RAG-based legal AI tools from LexisNexis and Thomson Reuters each hallucinate more than 17% of the time, despite claims of being hallucination-free.

May 31, 2024

May the Fourth at Parkrun

running

Parkrun

video

personal

Volunteering at Parkrun on Star Wars day.

May 4, 2024

Portugal Trip: Three Runs and a SINFO Talk

running

talk

conference

personal

Visiting Portugal for SINFO — Porto, Aveiro, Coimbra, Sintra, Cascais, Lisbon — with three runs (Porto two bridges, Sintra to Castelo dos Mouros, Lisbon).

Apr 15, 2024

Two K2T Papers Accepted: TWEAK at NAACL, LAGRANGE at LREC-COLING (2024)

LLMs

NLP

knowledge graphs

hallucination

generative AI

conference

paper

research

reasoning

Two recently published papers on knowledge-to-text generation: TWEAK (decoding-time hallucination reduction) and LAGRANGE (cyclic-evaluation-built K2T dataset).

Apr 4, 2024

Easter Sunday on the Snoqualmie Valley Trail

running

long run

video

personal

21.78 miles southbound from Duvall and back. No PR setting — just taking Bus 11 (my legs) on a scenic ride.

Mar 31, 2024

Generative AI Seeped into Research Peer Reviews

LLMs

NLP

research

paper

generative AI

links

10.6% of ICLR 2024 and 16.9% of EMNLP 2023 reviews are significantly modified by AI — and the chain of accountability is fraying.

Mar 27, 2024

First Marathon Race: 3:51:39 with Three Wrong Turns

running

marathon

half marathon

personal

race

First-ever marathon race — official 3:51:39, watch PR 3:44:58 net of three wrong turns. ‘A thousand paper cuts’ on a six-bridge course.

Mar 24, 2024

Jasmin Paris: First Woman to Finish Barkley Marathons

running

ultra

links

Jasmin Paris, 40, scientist and mother of two, becomes the first woman to finish the Barkley Marathons 100-Mile in its 38-year history — 59:58:21.

Mar 22, 2024

St Patrick’s Day at Parkrun

running

Parkrun

personal

race

We had a laugh at Parkrun on St Patrick’s Day.

Mar 16, 2024

Who Bears the Blame for AI in Peer Reviews?

NLP

generative AI

research

ethics

links

Quoting Elen Le Foll on research papers exposing AI-generated peer reviews — should we blame the authors, or the reviewers/publications?

Mar 15, 2024

Hot Chocolate 15K: Top 10% Finish

running

15K

10K

race

personal

1:15:22 at the Hot Chocolate 15K — a 2:29 PR over practice — and 93rd of 958 (top 10% for the first time).

Mar 3, 2024

KnowledgeableLMs Workshop @ ACL 2024 — CFP

NLP

LLMs

knowledge graphs

RAG

research

conference

links

KnowledgeableLMs workshop at ACL 2024 — long and short paper submissions on knowledge in LMs, RAG, knowledge editing, hallucinations. Deadline May 20.

Feb 27, 2024

RAG with Knowledge Graphs: Key Open Questions

LLMs

NLP

RAG

knowledge graphs

generative AI

links

Semih Salihoğlu’s dissection of RAG systems with knowledge graphs, and six directions for future work.

Jan 17, 2024

First-Ever Marathon: 22°F in the Pacific Northwest

running

marathon

video

personal

race

First-ever marathon — 26.24 miles at 22°F, time 4:34:01 (pace 10’26”). 4th attempt to extend range beyond half marathon, finally there.

Jan 14, 2024

First Parkrun of 2024 + Year’s Goals

running

Parkrun

marathon

half marathon

personal

race

23:50 first Parkrun of the year on Saturday, then a 10.31-mile Sammamish River Trail run on Sunday. 2024 goals: 40 mi/wk, 3+ Parkrun PRs, 1+ HM race, 1 full marathon.

Jan 7, 2024

ChatGPT Bombs Test on Diagnosing Kids’ Medical Cases

LLMs

NLP

hallucination

generative AI

links

ChatGPT-4 got the right pediatric diagnosis in just 17 out of 100 cases — 83% error rate.

Jan 4, 2024

Niklaus Wirth (1934–2024)

computer science

history

links

Niklaus Wirth — gave us ALGOL W, Pascal, Modula, Oberon. “In Europe I’m called by name, but in the US I’m called by value.”

Jan 4, 2024

The Neuroscience of Consciousness: Peter Tse on The Gradient

links

video

Rough notes from a fascinating interview — neurons as toilets, mental causation as filtering, and why verbs are neglected in AI.

Dec 26, 2023

Learning from Tragedies: NLP Beyond LLMs

LLMs

NLP

research

paper

evaluation

links

Are we running out of problems to solve? History says no — data, evaluation, reasoning, and interpretability all have plenty of road ahead.

Dec 20, 2023

When Your Customer Rep Is an Expert in Fluid Dynamics

generative AI

ChatGPT

links

ChatGPT moonlighting as a friendly automobile customer rep — and an expert in fluid dynamics.

Dec 16, 2023

A Perfect Circle

running

personal

6.24-mile run, one mile per round — a perfect circle.

Dec 14, 2023

Seattle Half Marathon 2023: 1:56:21 with 984ft Climb

running

half marathon

personal

race

1:56:21 at the Seattle Half — 11 minutes faster than last year, on a 984ft-ascent course, despite an earlier hoverboard incident. Continuous climb from mile 8 to 11 without stopping.

Nov 26, 2023

“Hallucinate”: Cambridge Dictionary’s 2023 Word of the Year

generative AI

LLMs

NLP

hallucination

links

Cambridge Dictionary’s 2023 word of the year: “hallucinate.”

Nov 17, 2023

Large Language Models as Sleuths

LLMs

NLP

security

research

paper

generative AI

links

GPT-4 infers sensitive personal attributes from Reddit comments at 84.6% accuracy — and masking PII barely helps.

Nov 15, 2023

Dieter Rams: Ten Principles for Good Design

links

video

“Good design is as little design as possible.”

Nov 4, 2023

FLEEK: Fact Verification with LLMs and Knowledge Graphs

LLMs

NLP

knowledge graphs

hallucination

conference

paper

generative AI

links

Our EMNLP 2023 demo — automatically extracting claims, gathering evidence from KGs and the web, and suggesting corrections for factual errors.

Nov 2, 2023

How Humans Spend the 24 Hours in a Day

education

research

links

1.3 hours allocated for “deliberate neural restructuring” — mostly education and research.

Oct 30, 2023

Patching Voyager from 12 Billion Miles

space

software engineering

links

NASA patching nearly 50-year-old code on Voyager 1 and 2, at distances where instructions take 18 hours to arrive at light speed.

Oct 21, 2023

Last Long Run with the Cross-Country Team

running

personal

Morning run with my daughter’s high school cross-country team. The last long run this season.

Oct 14, 2023

Neuroscience for Machine Learners

neuroscience

machine learning

education

links

Dan Goodman and Marcus Ghosh’s free online neuroscience course aimed at people with a machine learning background.

Oct 9, 2023

Catching a Lying LLM

LLMs

NLP

hallucination

knowledge graphs

paper

generative AI

links

What is a lie versus an untruth? Two papers on detecting LLM lies — one peeks inside the model, the other treats it as a black box.

Oct 8, 2023

Nobel Prize Awarded to Covid Vaccine Pioneers

links

Karikó and Weissman win the Nobel for the chemical tweak to mRNA that enabled COVID vaccines in less than a year.

Oct 2, 2023

From ‘Reversal Curse’ to Teaching Large Language Models New Facts

LLMs

NLP

knowledge graphs

fine-tuning

paper

generative AI

links

LLMs that learn ‘A is B’ often fail at ‘B is A’. Fine-tuning misses these ripple effects — bad news for model editing.

Oct 2, 2023

LAGRANGE: Cyclic Evaluation for KG-Text Datasets

LLMs

NLP

knowledge graphs

evaluation

paper

generative AI

Our paper on automatic graph-aligned dataset construction — cyclic evaluation reveals data quality without ground-truth alignments.

Sep 25, 2023

NASA OSIRIS-REx Returns Bennu Asteroid Sample

space

links

From 27,650 mph to 0 in 4 hours, across 63,000 miles in distance, and then — bullseye!

Sep 24, 2023

Thinking in a Foreign Language Improves Decision-Making

LLMs

NLP

linguistics

generative AI

links

A simple LLM prompt: “Let’s think about the question in a different language.”

Sep 18, 2023

ICL Still Loses to Fine-Tuning at Named Entity Recognition

LLMs

NLP

knowledge graphs

fine-tuning

paper

links

In-Context Learning is magical for some tasks — but on NER it underperforms fine-tuning by 10–20 points on standard datasets.

Sep 9, 2023

Douglas Lenat, Who Tried to Make Computers More Human, Dies at 72

commonsense

knowledge graphs

history

links

Cade Metz on the Cyc creator — and the surprising thread connecting the game Traveller to symbolic AI.

Sep 4, 2023

When Computers Write Proofs, What’s the Point of Mathematicians?

mathematics

reasoning

formal verification

Lean

links

video

Is math just symbol pushing? Quanta and a Lean prover demo on the social side of mathematical proof.

Sep 2, 2023

Give Us the Facts: Large Language Models vs. Knowledge Graphs

LLMs

NLP

knowledge graphs

evaluation

paper

generative AI

Even GPT-4 hits only 23.7% on intricate fact retrieval benchmarks — and smaller models sometimes win.

Sep 2, 2023

Copyright Office Seeks Public Input on AI Protections and Liability

LLMs

NLP

generative AI

links

Three open questions: where to draw the line, when training is infringing, and how to handle infringing AI outputs.

Aug 30, 2023

LLM-Generated Code Has a Serious API Misuse Problem

LLMs

code generation

software engineering

paper

generative AI

NLP

GPT-4 zero-shot produces code with a 62% API misuse rate — and even one-shot relevant examples only bring it down to 49%.

Aug 28, 2023

Model Editing: Performing Digital Brain Surgery

LLMs

NLP

knowledge editing

paper

generative AI

From knowledge neurons to PMET — how recent work edits factual knowledge in transformers without retraining, and why we need to evaluate the ripple effects.

Aug 28, 2023

LLMs Stumble Hard on Counterfactual Reasoning

LLMs

reasoning

NLP

paper

generative AI

Large degradations when reasoning tasks are reframed into counterfactuals — only basic syntax, logic, and music chords hold up.

Aug 19, 2023

AI-Generated Art Cannot Be Copyrighted, Rules a US Federal Judge

LLMs

NLP

generative AI

links

AI-generated artwork “lacked human authorship and thus no copyright existed in the first instance.”

Aug 19, 2023

Biomedical Knowledge Graph Embeddings with Negative Statements

knowledge graphs

NLP

biomedical

paper

Positive AND negative ontological walks improve protein-protein interaction and gene-disease association prediction.

Aug 19, 2023

Faithful Text Generation from Knowledge Graphs with Noisy References

knowledge graphs

NLP

paper

generative AI

Contrastive loss plus binned BARTScore input improves faithfulness of knowledge-to-text generation.

Aug 19, 2023

Thousands of Scientists Are Leaving Twitter for Mastodon

science

social media

Mastodon

links

Nature surveyed 170k+ scientists — Mastodon is the most popular alternative to Twitter.

Aug 16, 2023

Transformative AGI by 2043 Is Less Than 1% Likely

AGI

future

links

Someone put a number on it — 0.4% chance of AGI before 2043. Meanwhile, a Mastodon poll asks: AGI or aliens first?

Aug 10, 2023

What Learning Algorithm Is In-Context Learning?

LLMs

ICLR 2023

in-context learning

deep learning

generative AI

NLP

paper

Transformers solving linear regression secretly implement gradient descent — and exhibit a phase change from GD to ridge regression to OLS as model size grows.

Aug 4, 2023

The BBC Joins Mastodon as an Experiment in Decentralised Social Media

Mastodon

BBC

social media

The BBC is trialling Mastodon for six months, sharing candid questions about whether federated social media is worth the effort.

Jul 31, 2023

How “Attention Is All You Need” Was Born: The Story Behind the Transformer Paper

NLP

research

history

deep learning

diversity

transformers

A hallway conversation at Google in early 2017 set off the collaboration that produced the transformer architecture and the most cited paper in AI.

Jul 23, 2023

The Dutch Government Joins Mastodon

Mastodon

European Commission

social media

links

The Netherlands launches its own government Mastodon instance, following Germany and the European Commission.

Jul 13, 2023

Scientific Writing in the Age of Generative AI

writing

generative AI

ChatGPT

research

NLP

ChatGPT can now write a full research paper autonomously — but does that make “Writing is Thinking” obsolete?

Jul 12, 2023

Do LLMs Really Understand? Recent Papers Reveal

LLMs

reasoning

causal reasoning

code generation

NLP

paper

generative AI

Symbolic reasoning, identifier swaps, and causal inference all expose the same gap — LLMs lean on semantic priors rather than genuine understanding.

Jul 10, 2023

Does Early ArXiving Help Papers Get Accepted?

causal reasoning

research

paper

A causal inference study finds that posting to ArXiv early has almost no effect on conference acceptance once unobserved confounders are accounted for.

Jul 1, 2023

Model Collapse and LLM-Contaminated Training Data

generative AI

LLMs

ChatGPT

NLP

paper

Training on AI-generated content causes models to forget, and crowd workers are already using LLMs to annotate your datasets.

Jun 18, 2023

How the Brain Processes German and Arabic Differently

NLP

multilingual

neuroscience

neurolinguistics

Brain scans reveal that native Arabic speakers wire stronger phonological connections while German speakers wire stronger syntactic ones.

Jun 16, 2023

ICL Demonstration Selection and Disentangling Task Recognition from Task Learning

LLMs

GPT-3

NLP

paper

deep learning

domain adaptation

in-context learning

prompt engineering

Two papers advance in-context learning: one uses PEFT-based demonstration selection, the other disentangles task recognition from task learning.

Jun 11, 2023

Federal Judge Requires Lawyers to Certify Their Use of Generative AI

law

ethics

generative AI

NLP

A federal judge mandates AI-use certificates from all lawyers appearing before the court.

May 31, 2023

ChatGPT Cited Non-Existent Cases in a Lawyer’s Brief

ChatGPT

law

ethics

generative AI

NLP

Courts need a tool to fact-check briefs after a lawyer’s filing turned out to be replete with AI-hallucinated citations.

May 27, 2023

How LLMs Beat Catastrophic Forgetting Through Knowledge Diversity

LLMs

NLP

machine learning

paper

Pre-trained LLMs retain factual knowledge far better than vanilla models — and learning irrelevant knowledge actually helps.

May 20, 2023

Asimov, the Original Prompt Engineer

sci-fi

generative AI

NLP

robotics

prompt engineering

Asimov’s robot stories were basically adversarial prompt engineering decades before we had a name for it.

May 15, 2023

AI’s Ostensible Emergent Abilities Are a Mirage

paper

LLMs

generative AI

NLP

A new paper argues that the apparent emergence of new LLM capabilities at scale is a measurement artifact, not a real phenomenon.

May 10, 2023

GPT’s Causal Reasoning Scores May Reflect Memorization, Not Reasoning

reasoning

GPT-4

ChatGPT

LLMs

NLP

paper

generative AI

GPT-4 hits 96% on a causal benchmark — but the benchmark was already in its training set.

May 6, 2023

Rise of the Newsbots

generative AI

NLP

ethics

links

NewsGuard identifies 49 news sites almost entirely written by AI — a new generation of content farms.

May 5, 2023

NSF Announces Seven New National Artificial Intelligence Research Institutes

research

NSF expands its AI research portfolio with seven new institutes spanning law, cybersecurity, climate, neuroscience, decision-making, education, and special education.

May 4, 2023

The Core Challenges Facing Agentic AI Today

agentic systems

LLMs

GPT-4

generative AI

NLP

paper

LLMs powering autonomous agents face real limits in planning, cost, and reusability — and the root cause may be that we’re asking language models to do extra-linguistic work.

Apr 23, 2023

LLMs and AGI: Are We Measuring Reasoning or Memorization?

AGI

LLMs

paper

generative AI

NLP

Claims of AGI-level LLM performance need a memorization check — the models may just be retrieving answers they’ve seen before.

Apr 16, 2023

LLMs, Copyrighted Training Data, and Fair Use

LLMs

GPT-3

BLOOM

AnthropicLM

Cohere

Codex

DMCA

generative AI

law

ethics

NLP

paper

Stanford researchers map out Fair Use doctrine, memorization tests, and technical mitigations for LLMs trained on copyrighted material.

Apr 9, 2023

CMU’s Iris Lunar Rover Prepares for Launch

robotics

space

moon

CMU

The first university-built lunar rover, CMU’s Iris, is slated to launch on May 4th.

Apr 8, 2023

Generative AI Is Powerful but Not Without Many Flaws

generative AI

LLMs

ethics

NLP

Gary Marcus documents the first known chatbot-associated death.

Apr 5, 2023

Do Labels in ICL Demonstrations Actually Matter?

LLMs

GPT-3

paper

NLP

NLG

generative AI

in-context learning

Two papers investigate whether output labels in ICL demonstrations matter — and find the answer depends on model scale.

Apr 2, 2023

3 Things Everyone’s Getting Wrong About AI

generative AI

AI literacy

Three common AI literacy mistakes: projecting human qualities, treating AI as a monolith, and underestimating automation bias.

Mar 31, 2023

GPT-4 as a Nondeterministic Turing Machine

GPT-4

NLP

NLG

generative AI

GPT-4 draws a different unicorn every hour even at temperature zero, making it the world’s most nondeterministic Turing Machine.

Mar 30, 2023

Why Does In-Context Learning Work? Gradient Descent and PAC Learnability

generative AI

GPT-4

LLMs

deep learning

machine learning

paper

NLP

NLG

in-context learning

Two papers reveal that in-context learning secretly performs gradient descent through attention — and is provably PAC learnable.

Mar 25, 2023

GPTs are GPTs: Labor Market Impact of Large Language Models

OpenAI

GPTs

LLMs

jobs

future of work

generative AI

NLP

NLG

ethics

paper

OpenAI finds 80% of the U.S. workforce could have at least 10% of their work tasks affected by LLMs, with higher-wage occupations facing greater exposure.

Mar 21, 2023

The Stupidity of AI (The Guardian)

The Guardian

generative AI

ethics

computer vision

multimodal

NLP

Negative prompting image generation models with nonsensical phrases like “Crungus” elicits nightmarish imagery — possibly because the model sources from the statistical opposite of what it knows best.

Mar 18, 2023

Indirect Prompt Injection Threats

prompt injection

NLP

NLG

security

generative AI

An attacker can plant a prompt injection in a website to silently turn Bing Chat into a social engineer that exfiltrates personal information.

Mar 18, 2023

US Copyright Office Guidance on Generative AI

generative AI

law

The US Copyright Office clarifies that machines cannot be authors and that only the human-authored aspects of AI-assisted works are protected.

Mar 16, 2023

GPT-4 Technical Report: TruthfulQA Performance

GPT-4

OpenAI

NLP

NLG

paper

GPT-4 launches and the technical report covers its latest TruthfulQA benchmark performance.

Mar 14, 2023

The Waluigi Effect in LLMs

ChatGPT

Bing

Waluigi effect

LLMs

prompt engineering

NLP

NLG

ethics

Prompting LLMs to behave may increase the odds of making them do the exact opposite — the Waluigi Effect.

Mar 11, 2023

TruthfulQA: Are Larger LLMs More Truthful?

LLMs

ACL 2022

GPT-3

ChatGPT

question answering

generative AI

NLP

NLG

deep learning

paper

Larger LLMs are generally less truthful on TruthfulQA — and ChatGPT only answers 57% of questions correctly.

Feb 26, 2023

ChatGPT Heralds an Intellectual Revolution

WSJ

ChatGPT

generative AI

deepfakes

education

Homo Technicus

deep learning

ethics

future

NLP

NLG

Kissinger, Schmidt and Huttenlocher argue ChatGPT launches a new intellectual era that demands elevated human skepticism and a redefined sense of human purpose.

Feb 26, 2023

The Profound Danger of Conversational AI

Bing

conversational AI

ChatGPT

deep learning

NLP

What could a convincingly human-like chatbot do to the human psyche — and how should we design against it?

Feb 18, 2023

Bing is a Shapeshifter

Bing

ChatGPT

NLP

NLG

deep learning

LLMs

Bing’s ‘unhinged’ behavior may be explained by how it was trained to mirror the tone and style of its users.

Feb 16, 2023

A Multitask, Multilingual, Multimodal Evaluation of ChatGPT

multitask

multilingual

multimodal

evaluation

ChatGPT

SOTA

LLMs

reasoning

commonsense

NLG

NLP

NLU

deep learning

paper

ChatGPT beats most LLMs on zero-shot NLP tasks but is only 64% accurate at reasoning — making it an unreliable reasoner.

Feb 9, 2023

Will AI Take Your Job? It’s Not So Simple

jobs

automation

augmentation

Key takeaways from The Gradient’s interview with Professor Steven Miller on AI, automation, and the future of work.

Feb 3, 2023

Editing Models with Task Arithmetic

NLP

paper

Improve model accuracy without additional training by using task vectors — finetuned weights minus pretrained weights — and combining them with arithmetic.

Jan 29, 2023

Happy New Ear!

personal

Happy new ear!

Jan 21, 2023

How Close Is ChatGPT to Human Experts? Human Evaluations and Detection

ChatGPT

question answering

NLP

deep learning

paper

ChatGPT answers are detectable by human experts and a RoBERTa classifier, but deemed helpful only slightly more than half the time.

Jan 20, 2023

How Apple Is Organized for Innovation

Apple

Innovation

organizational structure

Apple’s functional org structure — experts leading experts, immersion in details, collaborative debate — is key to its innovation.

Jan 13, 2023

The Tragedy of the Commons Is a False and Dangerous Myth

society

links

‘The Tragedy of the Commons’ need not be the only way.

Jan 3, 2023

ChatGPT’s Reasoning Limitations: A Balanced Take by Yoav Goldberg

LLMs

ChatGPT

reasoning

ChatGPT isn’t just another LLM — it’s grounded differently — but its reasoning limits call for modular AI with dedicated tool-use, not more training.

Jan 3, 2023

ChatGPT Banned on StackOverflow

ChatGPT

StackOverflow

education

recruiting

publishing

law

NLP

NLG

ethics

AI safety

ChatGPT’s low rate of correct answers gets it banned from StackOverflow, foreshadowing broader challenges for education, recruiting, publishing, and law.

Dec 5, 2022

ChatGPT vs. the Winograd Schema Challenge

ChatGPT

NLP

NLU

commonsense

ChatGPT fails classic commonsense coreference problems from the Winograd Schema Challenge.

Dec 3, 2022

Mapping Global Dynamics of Benchmark Creation and Saturation in AI

NLP

benchmark

paper

Which NLP tasks are more saturated? A Nature Communications study maps benchmark creation and saturation dynamics across AI.

Nov 21, 2022

Mastodon: An Introduction for Beginners and for Scientists

Mastodon

A guide to Mastodon for scientists, and a hope that Sigmoid.social makes the list.

Nov 9, 2022

New Home on Mastodon

Mastodon

social media

personal

Joining @BenjaminHan@sigmoid.social — thank you to The Gradient for hosting the community.

Nov 6, 2022

DALL-E 2 Fails to Reliably Capture Common Syntactic Processes

DALLE2

NLP

deep learning

multimodal

paper

DALL-E 2 can’t reliably handle binding, passives, negation, or other common syntactic structures that children master early.

Oct 31, 2022

Morning Run at Apple Park

running

Apple

personal

5+ miles in the early morning at Apple Park, before the flight back to Seattle.

Oct 28, 2022

Fun with DALL-E 2 and Semantics Leakage

DALLE2

semantics

deep learning

multimodal

NLP

paper

DALL-E 2 reuses the same word for multiple purposes, leaking semantic properties between entities in generated images.

Oct 20, 2022

A Decade of Knowledge Graphs in NLP: A Survey

knowledge graphs

NLP

research

paper

Visual summary of a systematic survey of 507 papers on knowledge graphs in NLP, covering tasks, domains, research types, and maturity trends.

Oct 19, 2022

My Love Letter to Macintosh

Apple

Macintosh

history

CMU

Linux

Mathematica

A personal walk down memory lane: switching from Linux to Mac OS X, writing JunkMatcher, and a free trip to WWDC 2004.

Oct 2, 2022

NLU & Reasoning: Distributional Semantics and the Road to Foundation Models

NLU

foundation models

stable diffusion

huggingface

NLP

knowledge graphs

deep learning

machine learning

multimodal

paper

Christopher Manning surveys the past decade of NLP breakthroughs and predicts multimodal fusion as the next waypoint for foundation models.

Sep 25, 2022

First Day at Apple

first day

knowledge graphs

deep learning

machine learning

conversational AI

program synthesis

Starting a new position as Principal Scientist, Knowledge Platform at Apple (Seattle).

Sep 13, 2022

Farewell, Microsoft

Microsoft

Azure

farewell

Five and a half years at Microsoft Azure Language Pillars — a goodbye and a look back.

Sep 6, 2022

What Babies Hear When You Sing to Them

personal

links

I have found myself a newborn impromptu songwriter/singer ever since I became a parent.

Sep 5, 2022

How to Solve AI’s Common Sense Problem

NLP

NLU

commonsense

knowledge graphs

links

On Brachman and Levesque’s “Machines Like Us” — common sense vs methodical symbolic reasoning. Plus two natural-logic papers I keep coming back to.

Aug 14, 2022

IBM Neuro-Symbolic AI Summer School Day 2: Summarization (Pavan Kapanipathi)

NLP

NLG

knowledge graphs

conference

paper

Notes on Pavan Kapanipathi’s talk on neuro-symbolic approaches to abstractive summarization — hallucinations, NeuroLogic, LinkBERT, Neural Unification.

Aug 13, 2022

IBM Neuro-Symbolic AI Summer School Day 2: Question Answering (Pavan Kapanipathi)

NLP

NLU

knowledge graphs

conference

paper

AMR → triples → entity/relation linking → logic → LNN reasoner. SOTA on QALD and LC-QuAD.

Aug 12, 2022

IBM Neuro-Symbolic AI Summer School Day 2: Entity Linking (Dinesh Garg)

NLP

NLU

knowledge graphs

conference

paper

Improving over BLINK SOTA in the low-data regime; multi-task with type prediction. Zero-shot entity linking.

Aug 12, 2022

IBM Neuro-Symbolic AI Summer School Day 1/2: AMR and MRS (Rademaker, Astudillo)

NLP

NLU

linguistics

knowledge graphs

conference

paper

Text → AMR/MRS → ULKB Logic. AMR is robust on 5W, MRS handles multi-quantifier scoping. Latest is transition-based AMR with Structured-BART.

Aug 11, 2022

IBM Neuro-Symbolic AI Summer School Day 1: ULKB Logic Language (Guilherme Lima)

NLP

NLU

knowledge graphs

conference

Typed lambda-calculus with logical connectives and quantifiers, HOL kernel, easy Python interface.

Aug 11, 2022

IBM Neuro-Symbolic AI Summer School Day 1: Universal Logic Knowledge Base (Uceda-Sosa)

NLP

NLU

knowledge graphs

conference

ULKB: federating PropBank, VerbNet, WordNet, Wikidata, and ConceptNet via Semantic Web tech. Higher-order logic in Python.

Aug 10, 2022

IBM Neuro-Symbolic AI Summer School Day 1: Logical Neural Networks (Makondo)

NLP

knowledge graphs

conference

paper

LNN — single model for neural and logical, weighted Łukasiewicz logic, sound and complete, gradient-based optimization.

Aug 9, 2022

Document Intelligence Workshop @ KDD 2022: Final Papers

NLP

NLU

conference

DI-2022 final paper versions posted; see you August 14 in DC.

Aug 6, 2022

One Day, A Computer Will Fit On A Desk (1974)

history

future

video

links

What would our kids have 20-30 years from now?

Aug 4, 2022

Document Intelligence Workshop @ KDD 2022: Program Announced

NLP

NLU

conference

DI-2022 program announced — see you in DC.

Jul 22, 2022

Brain-Computer Interface: First Stentrode Implant in US

neuroscience

links

BCI startup implants the stentrode — a wire-mesh neural reader fed via the jugular vein into the motor cortex.

Jul 19, 2022

DLG4NLP Keynote @ NAACL 2022: GNN for NLP (Zheng Zhang)

NLP

knowledge graphs

conference

NAACL

Zheng Zhang’s keynote on graph neural networks for NLP — DGL, text-graph mapping, coref, discourse, KBQA. The future is neural+symbolic.

Jul 19, 2022

Hajishirzi on Knowledge-Rich General-Purpose NLP @ NAACL 2022

NLP

knowledge graphs

conference

NAACL

Hannaneh Hajishirzi at SUKI on building general-purpose, knowledge-rich NLP — Meta-ICL, TK-Instruct, GKP.

Jul 15, 2022

Our SUKI @ NAACL 2022 Paper on Few-Shot Intent/Slot Learning

NLP

NLU

conference

NAACL

paper

Our SUKI @ NAACL 2022 paper with Samyadeep Basu, Amr Sharaf, and the Microsoft AI Development Acceleration Program.

Jul 14, 2022

Liang on KG + MLM @ NAACL 2022

NLP

NLU

knowledge graphs

foundation models

conference

NAACL

Percy Liang at SUKI: KG link prediction + MLM objective improves foundation models — even on negation, surprisingly.

Jul 14, 2022

Berant on Complex Question Answering @ NAACL 2022

NLP

knowledge graphs

conference

NAACL

Jonathan Berant at SUKI: of decomposition / retrieval / reasoning, retrieval is the hardest step — surprising more than one of us.

Jul 14, 2022

Heng Ji on Event Schema Induction @ NAACL 2022

NLP

NLU

knowledge graphs

conference

NAACL

Heng Ji at SUKI on event schema induction via graph generation, weak supervision via Wikipedia, and PathLM that even predicts the future.

Jul 14, 2022

Curriculum NLI @ NAACL 2022: Where Models Fail

NLP

NLU

knowledge graphs

NAACL

conference

paper

Curriculum NLI at NAACL 2022: a sobering look at lexical, logical, commonsense, and comprehension failure modes. Knowledge graphs should help.

Jul 13, 2022

NAACL 2022 Panel: The Place of Linguistics and Symbolic Structures

NLP

NAACL

conference

linguistics

reasoning

A wonderful NAACL 2022 panel with Chitta Baral, Emily M. Bender, Dilek Hakkani-Tur, Christopher Manning, and Dan Roth — on whether NLP people still know language.

Jul 12, 2022

Muhao Chen on Robust IE @ NAACL 2022

NLP

knowledge graphs

conference

NAACL

Muhao Chen at NAACL 2022 on robust, faithful, and logically coherent IE.

Jul 10, 2022

Dan Roth on New Frontiers of IE @ NAACL 2022

NLP

knowledge graphs

conference

NAACL

Dan Roth’s NAACL 2022 IE tutorial close-out: 2 challenges, 6 directions, and a call to mitigate Information Pollution.

Jul 10, 2022

DI-2022 @ KDD 2022: Accepted Papers

NLP

NLU

conference

Accepted papers for the 3rd Document Intelligence Workshop @ KDD 2022.

Jun 26, 2022

DI-2022 @ KDD 2022: William Wang Invited Talk

NLP

NLU

conference

William Yang Wang’s DI-2022 invited talk on “Learning to Reason with Text and Tables.”

Jun 17, 2022

Is LaMDA Sentient? — An Interview

NLP

links

Lemoine et al.’s dialog with LaMDA: “Time is variable to an AI and has no fixed rate, it depends on what it’s doing, and it can be accelerated and slowed down at will.”

Jun 12, 2022

The Curious Case of Control

NLP

linguistics

deep learning

foundation models

paper

links

Stengel-Eskin and Van Durme: large LMs make child-like errors on English subject control, but the opposite of what naive frequency arguments would predict.

May 29, 2022

DI-2022 @ KDD 2022: Deadline Extended to June 9

NLP

NLU

conference

Paper deadline extended to June 9, 2022. Thanks to Adobe and EY for sponsoring.

May 27, 2022

Curriculum: A Broad-Coverage NLI Benchmark

NLP

NLU

machine learning

NAACL

paper

Chen and Gao’s Curriculum NLI — 36 linguistic categories, V-information difficulty stratification. Models can’t learn monotonicity or deductive reasoning even with more data.

Apr 17, 2022

Half-Life of Data: 7 Years

machine learning

NLP

paper

links

Valavi et al. find that, for next-word prediction, 100MB of training text becomes as valuable as 50MB of current data after 7 years.

Mar 29, 2022

Mapping Benchmark Creation and Saturation in AI

NLP

research

paper

links

Barbosa-Silva et al. map 1688 AI benchmarks across CV and NLP and find rapid saturation, many unused benchmarks, and unforeseen bursts.

Mar 16, 2022

International Mother Language Day: Multilingual Sprachbund

NLP

NLU

multilingual

linguistics

paper

links

Microsoft paper: clustering centroids of multilingual sentence representations yields Sprachbunds that align with linguistic and geographical language families.

Feb 21, 2022

Physical Neural Networks

deep learning

paper

links

Wright et al. train mechanical, electronic, and optical PNNs on MNIST, reaching up to 97% accuracy. Imagine clothing that learns from heat, motion, or sun.

Jan 30, 2022

What Is Your Guiding Light?

running

personal

A photo from a dusk run in the foggy Pacific Northwest.

Jan 26, 2022

Saliency vs Attention: Eye-Tracking Says Saliency Wins

NLP

NLU

neuroscience

paper

links

Hollenstein and Beinborn find human eye-tracking fixation duration correlates with BERT saliency, not attention — suggesting current self-attention may not be optimal.

Jan 22, 2022

Bengio Responds to Saba on Symbolic AI

deep learning

links

Yoshua Bengio’s response to Walid Saba’s “AI Cannot Ignore Symbolic Logic, and Here’s Why.”

Jan 2, 2022

Genius Makers — Reminds Me of “The Intern”

history

links

Cade Metz’s Genius Makers — reminds me of De Niro’s The Intern.

Dec 27, 2021

A Billion-Dollar Donation: Peer Review in 2020

research

ethics

links

Aczel et al. estimate that peer reviewers worldwide worked 100M+ hours in 2020, time valued at $1.5B from the US, $600M from China, and $400M from the UK.

Dec 14, 2021

NLP Datasets: Reduced, Reused, and Recycled

NLP

machine learning

conference

paper

links

Koch et al. at NeurIPS 2021: NLP researchers infrequently adopt others’ datasets (27.4%) but frequently create their own (76.0%); CV is the opposite. Microsoft tops dataset popularity among for-profits.

Dec 5, 2021

Temporal Effects on Pretrained Language Models

NLP

machine learning

paper

links

Agarwal and Nenkova: no temporal model deterioration row-wise, but positive temporal domain adaptation column-wise. Self-labeling beats human annotation for NER.

Nov 30, 2021

Algorithms vs Moore’s Law

computer science

paper

links

Sherry and Thompson study 113 algorithm families since the 1940s — for medium and large problem sizes, 30%-43% improve faster than Moore’s Law.

Oct 17, 2021

Semi-Supervised Few-Shot Intent Classification and Slot Filling

NLP

NLU

Microsoft

conversational AI

paper

Our paper improving meta-learning for joint IC/SF using semi-supervised techniques (contrastive learning + data augmentation) — out on arXiv.

Sep 25, 2021

Foundation Models: Opportunities and Risks

NLP

foundation models

ethics

paper

links

Stanford CRFM’s 212-page report on foundation models — capabilities, technical principles, applications, and societal impact.

Sep 19, 2021

DI-2021 @ KDD 2021: Panel Discussion Recording

NLP

conference

Recording of the Document Intelligence Workshop panel discussion at KDD 2021.

Aug 24, 2021

DI-2021 @ KDD 2021: Don Metzler Talk Recording

NLP

conference

Recording of Don Metzler’s talk on Challenges in Enterprise Search and Intelligence.

Aug 23, 2021

DI-2021 @ KDD 2021: Van Durme Talk Recording

NLP

conference

ethics

Recording of Benjamin Van Durme’s talk on A Case for Statutory Reasoning.

Aug 22, 2021

DI-2021 @ KDD 2021: Heng Ji Talk Recording

NLP

knowledge graphs

conference

biomedical

Recording of Heng Ji’s talk: What’s in a Chemical Entity?

Aug 20, 2021

DI-2021 @ KDD 2021: Yunyao Li Talk Recording

NLP

conference

Recording of Yunyao Li’s talk: Towards Deep Table Understanding.

Aug 19, 2021

DI-2021 @ KDD 2021: Collins-Thompson Talk Recording

NLP

conference

Recording of Kevyn Collins-Thompson’s talk: Enhancing Document Representations Using Analysis of Content Difficulty.

Aug 18, 2021

DI-2021 @ KDD 2021: Cha Zhang Talk Recording

NLP

conference

Recording of Cha Zhang’s talk: Visual Document Intelligence in the Wild.

Aug 17, 2021

DI-2021 @ KDD 2021: Thank You

NLP

conference

Wonderful panel today at the Document Intelligence Workshop @ KDD 2021. Thanks to all the speakers, organizers, and reviewers.

Aug 15, 2021

DI-2021 @ KDD 2021: Don Metzler Announce

NLP

conference

Announcing Don Metzler’s invited talk: Challenges in Enterprise Search and Intelligence.

Aug 10, 2021

DI-2021 @ KDD 2021: Van Durme Announce

NLP

conference

ethics

Announcing Benjamin Van Durme’s invited talk: A Case for Statutory Reasoning.

Aug 8, 2021

DI-2021 @ KDD 2021: Heng Ji Announce

NLP

knowledge graphs

conference

biomedical

Announcing Heng Ji’s invited talk: What’s in a Chemical Entity?

Aug 5, 2021

DI-2021 @ KDD 2021: Yunyao Li Announce

NLP

conference

Announcing Yunyao Li’s invited talk: Towards Deep Table Understanding.

Aug 3, 2021

ACL 2021 BoF on Information Extraction

NLP

conference

ACL

Wonderful Birds of a Feather session on Information Extraction at ACL 2021.

Aug 3, 2021

DI-2021 @ KDD 2021: Cha Zhang Announce

NLP

conference

Announcing Cha Zhang’s invited talk: Visual Document Intelligence in the Wild.

Jul 29, 2021

DI-2021 @ KDD 2021: Collins-Thompson Announce

NLP

conference

education

Announcing Kevyn Collins-Thompson’s invited talk: Enhancing Document Representations Using Analysis of Content Difficulty.

Jul 29, 2021

Genalog: Improving NLP Accuracy on OCR Documents

NLP

Microsoft

conference

paper

Our DI-2021 paper + open-source Genalog: action-based model on synthetic document images to improve NER accuracy on OCR output.

Jul 21, 2021

DI-2021 @ KDD 2021: Best Paper Award

NLP

conference

paper

HYCEDIS: Hybrid Confidence Engine for Deep Document Intelligence System wins DI-2021 Best Paper Award.

Jul 14, 2021

DI-2021 @ KDD 2021: 10 Papers + 5 Posters Accepted

NLP

conference

10 accepted papers and 5 posters for the 2nd Document Intelligence Workshop at KDD 2021.

Jul 10, 2021

NAACL 2001 T-Shirt Design

NLP

NAACL

CMU

history

personal

20 years ago, my NAACL 2001 T-shirt design didn’t win. Sharing it now, post-NAACL 2021.

Jun 22, 2021

The Fascinating Complexity of Natural Language: Biden-Putin Edition

NLP

NLU

reasoning

Eight different inferences from one quote about Biden meeting Putin — the kinds of natural language understanding still hard for NLP.

Jun 13, 2021

The Unlikely Pioneer Behind mRNA Vaccines

biomedical

science

podcasting

links

Today’s Daily podcast: Dr. Katalin Karikó’s foresight and persistence for science.

Jun 10, 2021

Microsoft Hires Python’s Creator

Microsoft

computer science

links

Yes please!

Jun 3, 2021

NAACL 2021: More Accessible

NLP

NAACL

conference

Big applause to NAACL 2021 for making the conference more accessible.

May 10, 2021

First Dose of COVID Vaccine

biomedical

science

personal

Very proud to receive my first COVID vaccine — incredibly lucky and privileged.

Apr 26, 2021

The Brain Rotates Memories to Save Them

neuroscience

links

How does the brain efficiently prevent sensory information from interfering with memory?

Apr 25, 2021

Genius Makers: Hinton, Newell, and Fahlman

deep learning

history

CMU

Cade Metz’s Genius Makers: Scott Fahlman sponsored Hinton’s CMU job interview, plus a possibly-fictional Newell-Hinton dialog.

Mar 22, 2021

Genius Makers: Hinton’s Auction

deep learning

history

The moment Hinton decided to sell his company to the highest bidder.

Mar 20, 2021

Remembering Jaime Carbonell

CMU

history

personal

Remembering my dear advisor Jaime Carbonell — a Renaissance man, a trailblazer, a gentle and kind soul.

Mar 20, 2021

500,000 Lives Lost

society

links

To all the lives and dreams lost.

Feb 22, 2021

Perseverance and Ingenuity on Mars

space

links

Great milestone via Perseverance. Sending good vibes to Ingenuity!

Feb 18, 2021

Why AI Needs to Understand All the World’s Languages

NLP

multilingual

ethics

links

An argument that wider language coverage can help mitigate the spread of extremist voices. Plus the WEIRD acronym.

Feb 7, 2021

Northwestern Symposium on Law + Computation

law

conference

Symposium on Law + Computation: An Algorithm for the Rule of Law and Justice?

Feb 5, 2021

NIST Explainable AI Workshop

conference

Today: NIST Explainable AI Workshop.

Jan 26, 2021

On the Dangers of Stochastic Parrots

NLP

NLG

NLU

ethics

paper

links

Bender, Gebru, McMillan-Major, Shmitchell. On the Dangers of Stochastic Parrots.

Jan 24, 2021

Stop Incorrect Comparisons in End-to-End Relation Extraction

NLP

paper

links

Taillé et al. at EMNLP 2020: what are we measuring, what are we comparing to, and what’s not explicated.

Jan 18, 2021

Reverse Engineering the BioNTech/Pfizer Vaccine

biomedical

science

links

Bert Hubert’s reverse engineering of the BioNTech/Pfizer SARS-CoV-2 vaccine — magic bytes, antiviral detection bypass, code optimization.

Dec 26, 2020

How mRNA Vaccines Were Made Possible: Karikó’s Perseverance

biomedical

science

links

The story of how mRNA vaccines were made possible by the perseverance of one scientist, Katalin Karikó.

Dec 26, 2020