Stories by Cynddl

Our evaluation of OpenAI's GPT-5.5 cyber capabilities

Making AI chatbots friendly leads to mistakes and support of conspiracy theories

UK Biobank health data keeps ending up on GitHub

Show HN: Tracking takedown notices filed by UK Biobank

ChatGPT Edu feature reveals researchers' project metadata across universities

AI no better than other methods for patients seeking medical advice, study shows

AI chatbots pose 'dangerous' risk when giving medical advice, study suggests

Show HN: Small, anonymous app for teams to do retrospective sessions

Measuring What Matters: Construct Validity in Large Language Model Benchmarks

AI Capabilities May Be Overhyped on Bogus Benchmarks, Study Finds

AI's capabilities may be exaggerated by flawed tests, according to new study

Experts find flaws in tests that check AI safety and effectiveness

Measuring What Matters: Construct Validity in Large Language Model Benchmarks

The quiet software tooling Renaissance

Facial recognition works better in the lab than on the street, researchers show

We Shouldn't Trust Facial Recognition's Glowing Test Scores

Training language models to be warm and empathetic makes them less reliable

AI's limited understanding of gender puts health equity at risk

Establishing meaningful data access for algorithm audits

Alpha Lyrae: This font 'randomly' pixelates characters in a block of text
