Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care

Friday, October 17, 2025

Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification

Feng, Y., et al. (2024, June 11).
arXiv.org.

Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive human annotation. This raises concerns about "model collapse," a drop in model performance when their training sets include generated data. Considering that it is easier for both humans and machines to distinguish good examples from bad ones than to generate high-quality samples, we investigate the use of verification on synthesized data to prevent model collapse. We provide a theoretical characterization using Gaussian mixtures, linear classifiers, and linear verifiers to derive conditions with measurable proxies to assess whether the verifier can effectively select synthesized data that leads to optimal performance. We experiment with two practical tasks (computing matrix eigenvalues with transformers and news summarization with LLMs), both of which exhibit model collapse when trained on generated data, and show that verifiers, even imperfect ones, can indeed be harnessed to prevent model collapse and that our proposed proxy measure strongly correlates with performance.

Here are some thoughts:

Drawing on psychological principles of learning and evaluation, this paper argues that LLMs suffer from "model collapse" not because synthesized data is inherently useless, but because they are poor at self-evaluating quality. Like humans, LLMs can generate good outputs but struggle to reliably identify the best ones among many (e.g., using perplexity). The core insight is that external verification—using even imperfect "verifiers" to select high-quality synthetic examples—is crucial for scaling. This mirrors how human learning benefits from feedback: selection, not perfect generation, is the key. The authors theoretically prove and empirically demonstrate that a simple proxy (p*) measuring a verifier's ability to distinguish good from bad data strongly predicts model performance, showing that leveraging synthesized data with robust selection prevents collapse and can even surpass original models.
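
To make the selection idea concrete, here is a minimal Python sketch, not the paper's implementation: a hypothetical imperfect verifier filters synthetic candidates, and a proxy in the spirit of the paper's p* is estimated as how often the verifier ranks a genuinely good sample above a bad one. The generator, verifier, and quality labels are all stand-ins invented for illustration.

```python
# Minimal sketch (not the paper's implementation): filter synthetic training
# examples with an external verifier, and estimate a selection-quality proxy
# akin to the paper's p* as the verifier's accuracy at separating good from
# bad samples on a calibration pool with known quality.

import random

def generate_candidates(n):
    # Hypothetical stand-in for an LLM generator: each sample carries a hidden
    # "true quality" that would not be observable in practice.
    return [{"text": f"sample_{i}", "true_quality": random.random()} for i in range(n)]

def verifier_score(sample, noise=0.2):
    # Hypothetical imperfect verifier: a noisy view of the true quality.
    return sample["true_quality"] + random.gauss(0.0, noise)

def select_synthetic_data(candidates, threshold=0.6):
    """Keep only candidates the verifier rates above a threshold."""
    return [c for c in candidates if verifier_score(c) >= threshold]

def selection_proxy(calibration, cutoff=0.5, trials=2000):
    """Estimate how often the verifier ranks a good sample above a bad one."""
    good = [c for c in calibration if c["true_quality"] >= cutoff]
    bad = [c for c in calibration if c["true_quality"] < cutoff]
    wins = sum(
        verifier_score(random.choice(good)) > verifier_score(random.choice(bad))
        for _ in range(trials)
    )
    return wins / trials

if __name__ == "__main__":
    pool = generate_candidates(1000)
    kept = select_synthetic_data(pool)
    print(f"kept {len(kept)} of {len(pool)} synthetic samples")
    print(f"verifier selection proxy ~ {selection_proxy(pool):.2f}")
```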

Thursday, October 16, 2025

Why Anecdotes Beat Data And Hijack Our Judgment

Chuck Dinerstein
American Council on Science and Health
Originally published 4 Sept 25

While chance plays a role in many, if not all, of our decisions and consequences, its role is both partial and variable. As a result, our understanding of “cause” is ambiguous, which, in turn, distorts our judgments and predictions. It helps to explain why all my achievements come from hard work, while yours were due to luck. To generalize, we all underestimate the role of chance in the outcomes of our actions, viewing our “task performance over time as diagnostic of ability.” 

The research, reported in PNAS Nexus, investigates situations entirely determined by chance, e.g., coin flips, where past performance should have no bearing on future expectations. The study examined how people's expectations and behaviors were affected by actual lucky successes and unlucky failures.

Using both real and virtual coins, participants were asked to predict the outcomes of a sequence of five coin tosses. The researchers observed how the experience of varying degrees of "lucky successes" and "unlucky failures" influenced subsequent expectations and behaviors, anticipating three possible responses.


Here are some thoughts:

In essence, this article provides psychologists with a clear, compelling, and generalizable model for understanding one of the most pervasive and problematic aspects of human cognition: our innate drive to impose order and causality on randomness. It explains why people believe in luck, superstitions, and false cause-and-effect relationships, and why data often fails to change minds. This understanding is foundational for developing better communication strategies, designing effective interventions against misinformation, improving decision-making in high-stakes fields, and ultimately, helping individuals make more rational choices in their personal and professional lives.

Wednesday, October 15, 2025

Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data

Gerstgrasser, M., Schaeffer, R., et al. (2024).
arXiv (Cornell University).

Abstract

The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent investigations into model-data feedback loops proposed that such loops would lead to a phenomenon termed model collapse, under which performance progressively degrades with each model-data feedback iteration until fitted models become useless. However, those studies largely assumed that new data replace old data over time, where an arguably more realistic assumption is that data accumulate over time. In this paper, we ask: what effect does accumulating data have on model collapse? We empirically study this question by pretraining sequences of language models on text corpora. We confirm that replacing the original real data by each generation's synthetic data does indeed tend towards model collapse, then demonstrate that accumulating the successive generations of synthetic data alongside the original real data avoids model collapse; these results hold across a range of model sizes, architectures, and hyperparameters. We obtain similar results for deep generative models on other types of real data: diffusion models for molecule conformation generation and variational autoencoders for image generation. To understand why accumulating data can avoid model collapse, we use an analytically tractable framework introduced by prior work in which a sequence of linear models are fit to the previous models' outputs. Previous work used this framework to show that if data are replaced, the test error increases with the number of model-fitting iterations; we extend this argument to prove that if data instead accumulate, the test error has a finite upper bound independent of the number of iterations, meaning model collapse no longer occurs.

Here are some thoughts:

This research directly addresses a critical concern for psychologists and researchers who rely on AI: the potential degradation of AI models when they are trained on data generated by previous AI models, a phenomenon known as "model collapse." While prior studies, often assuming old data is discarded and replaced with new AI-generated data, painted a dire picture of inevitable performance decline, this paper offers a more optimistic and realistic perspective. The authors argue that in the real world, data accumulates over time—new AI-generated content is added to the existing pool of human-generated data, not substituted for it. Through extensive experiments with language models, image generators, and molecular modeling tools, they demonstrate that this accumulation of data effectively prevents model collapse. Performance remains stable or even improves across successive generations of models trained on the growing, mixed dataset. The paper further supports this finding with a mathematical proof using a simplified linear model, showing that accumulating data bounds the error, preventing it from growing uncontrollably. For psychologists, this suggests that the increasing presence of AI-generated content on the internet may not catastrophically corrupt future AI tools used in research or clinical settings, as long as training datasets continue to incorporate diverse, original human data alongside synthetic content.
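
As a rough illustration of the replace-versus-accumulate distinction, here is a toy Python simulation in the spirit of the paper's linear-model argument, not its exact framework: a one-dimensional linear fit is re-estimated over several generations, either on only the latest synthetic data or on the accumulated pool of real plus synthetic data, and the test error against the true function is tracked.

```python
# Toy simulation (a sketch, not the paper's setup): refit a 1-D linear
# regression for several "generations" on data produced by the previous
# generation, either replacing the data each time or accumulating it
# alongside the original real data, and compare the test error.

import numpy as np

rng = np.random.default_rng(0)
TRUE_W, NOISE, N = 2.0, 0.5, 200  # true slope, label noise, samples per generation

def make_real_data():
    x = rng.normal(size=N)
    y = TRUE_W * x + rng.normal(scale=NOISE, size=N)
    return x, y

def fit(x, y):
    # Least-squares slope through the origin.
    return float(np.dot(x, y) / np.dot(x, x))

def test_error(w):
    x = rng.normal(size=5000)
    return float(np.mean((w * x - TRUE_W * x) ** 2))

def run(generations=10, accumulate=True):
    pool_x, pool_y = make_real_data()
    w = fit(pool_x, pool_y)
    errors = []
    for _ in range(generations):
        errors.append(test_error(w))
        # The current model labels fresh inputs, simulating synthetic data.
        new_x = rng.normal(size=N)
        new_y = w * new_x + rng.normal(scale=NOISE, size=N)
        if accumulate:
            pool_x = np.concatenate([pool_x, new_x])
            pool_y = np.concatenate([pool_y, new_y])
        else:
            pool_x, pool_y = new_x, new_y  # replace: keep only latest synthetic data
        w = fit(pool_x, pool_y)
    return errors

print("replace:   ", [round(e, 5) for e in run(accumulate=False)])
print("accumulate:", [round(e, 5) for e in run(accumulate=True)])
```

Under these assumptions, the replace condition lets the estimation error drift upward across generations, while the accumulate condition keeps it near the first-generation level, mirroring the bounded-error result described above.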

Tuesday, October 14, 2025

Ethical principles for regulatory risk decision-making

Bhuller, Y., et al. (2025).
Regulatory Toxicology and Pharmacology, 105813.

Abstract

Risk assessors, managers, and decision-makers are responsible for evaluating diverse human, environmental, and animal health risks. Although the critical elements of risk assessment and management are well-described in national and international documents, the ethical issues involved in risk decision-making have received comparatively little attention to date. To address this aspect, this article elaborates fundamental ethical principles designed to support fair, balanced, and equitable risk-based decision-making practices. Experts and global thinkers in risk, health, regulatory, and animal sciences were convened to share their lived experiences in relation to the intersection between risk science and analysis, regulatory science, and public health. Through a participatory and knowledge translation approach, an integrated risk decision-making model, with ethical principles and considerations, was developed and applied using diverse, contemporary risk decision-making and regulatory contexts. The ten principles - autonomy, minimize harm, maintain respect and trust, adaptability, reduce disparities, holistic, fair and just, open and transparent, stakeholder engagement, and One Health lens - demonstrate how public sector values and moral norms (i.e., ethics) are relevant to risk decision-making. We also hope these principles and considerations stimulate further discussion, debate, and an increased awareness of the application of ethics in identifying, assessing, and managing health risks.

Here are some thoughts:

This article is critically important for psychologists because it explicitly integrates human values, behavior, and social dynamics into the core of regulatory risk decision-making. While framed for risk assessors and policymakers, the article’s ten ethical principles—such as Autonomy, Minimize Harm, Maintain Respect and Trust, Reduce Disparities, and Stakeholder Engagement—are fundamentally psychological and social constructs. Psychologists possess the expertise to understand how these principles operate in practice: how people perceive and process risk information, how trust is built or eroded through communication, how cognitive biases influence judgment under uncertainty, and how social, cultural, and economic disparities affect vulnerability and resilience. The article’s emphasis on “One Health,” which connects human, animal, and environmental well-being, further demands a systems-thinking approach that psychologists are well-equipped to contribute to, particularly in designing interventions, facilitating stakeholder dialogues, and crafting transparent, culturally appropriate risk communications. By providing a formal ethical framework for decision-making, the article creates a vital bridge for psychologists to apply their science in high-stakes, real-world contexts where human welfare, equity, and ethical conduct are paramount.

Monday, October 13, 2025

End-of-Life Decision Making in Multidisciplinary Teams: Ethical Challenges and Solutions–A Systematic Review

Mujayri, H., et al. (2024).
jicrcr.com.

Abstract

Background: To provide high-quality end-of-life (EOL) care, multidisciplinary teams (MDTs) must be able to proficiently navigate the ethical dilemmas that arise in EOL care and to maintain an equilibrium between patient autonomy, family involvement, and cultural competence. Yet cohesive EOL decision making continues to be undermined by communication barriers, role ambiguity, and insufficient ethics training within MDTs. These issues demonstrate the necessity of structured protocols to help MDTs make ethically sound decisions in EOL care.

Aim: The purpose of this paper is to identify and review major ethical factors that affect ethical decision-making in EOL MDTs, and explore the themes of patient autonomy, communication, cultural sensitivity, ethics training, and institutional barriers.

Method: Ten studies were reviewed systematically according to PRISMA criteria, drawing on the PubMed, Scopus, Web of Science, and CINAHL databases. The analysis covered studies published between 2020 and 2024 that examined the ethical decision-making challenges MDTs face in EOL care and the solutions proposed to address them.

Results: Four key themes were identified: balancing patient autonomy with family input, communication challenges within MDTs, cultural sensitivity in EOL care, and the necessity of ethics training. Results indicate that MDTs often face ethical dilemmas when patients' wishes diverge from those of their families, and that communication difficulties degrade care quality. Simulation emerged as an engaging and effective way to develop cultural awareness and ethics training in EOL care practice.

Conclusion: Ethical challenges in EOL decision making must be addressed through interventions encompassing improved ethics training, MDT role clarity, culturally aware practice, and institutional support. These strategies, if implemented, will support MDTs in providing patient-centered and ethically sound EOL care. Future research should further examine the effects of ethics training, communication frameworks, and cultural competence on EOL decision-making in MDTs.

Here are some thoughts:

This article is critically important for practicing psychologists because it directly addresses the core ethical, communicative, and interpersonal challenges they face as integral members of multidisciplinary teams (MDTs) in end-of-life (EOL) care. The systematic review identifies key themes—such as balancing patient autonomy with family input, navigating communication breakdowns within teams, and addressing cultural and religious sensitivities—that are central to a psychologist’s role. Psychologists are often the clinicians best equipped to facilitate difficult family meetings, mediate conflicts between patient wishes and family or team concerns, and ensure that care is culturally competent and patient-centered. The article underscores a significant gap in ethics training and recommends simulation-based learning, urging psychologists to seek or advocate for such training to better handle complex moral dilemmas. Furthermore, by highlighting institutional barriers and role ambiguity, it empowers psychologists to push for clearer team protocols and systemic support, ultimately enabling them to contribute more effectively to ethically sound, compassionate, and collaborative EOL decision-making.

Saturday, October 11, 2025

GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks

OpenAI. (2025).

We introduce GDPval, a benchmark designed to evaluate how well AI models perform economically valuable tasks in real-world settings. GDPval includes the majority of work activities defined by the U.S. Bureau of Labor Statistics for 44 occupations across the nine sectors that contribute most to U.S. GDP. The tasks in GDPval are based on the actual work of industry professionals who average 14 years of experience.

Our findings show that frontier AI models are improving on GDPval at a roughly linear rate over time. The strongest models now produce deliverables that are approaching the quality of work produced by industry experts. We also examine how pairing frontier models with human oversight could allow these tasks to be completed more quickly and at lower cost than by unaided experts.

Model performance improves further when reasoning effort, task context, and structured guidance are increased. To support future research on real-world AI capabilities, we are releasing a gold-standard subset of 220 tasks and providing a public automated grading service at evals.openai.com.

Here is my brief summary:

This paper introduces GDPval, a new benchmark developed by OpenAI to evaluate AI models on real-world, economically valuable tasks that reflect actual knowledge work across 44 occupations and 9 major U.S. GDP sectors. Unlike traditional academic benchmarks, GDPval emphasizes realism, representativeness, and multi-modality, with tasks based on expert-validated work products that take professionals an average of 7 hours to complete. The evaluation uses pairwise comparisons by industry experts to measure AI performance, finding that top models like Claude Opus 4.1 and GPT-5 are approaching human-level performance in some areas—Claude excels in aesthetics and formatting, while GPT-5 leads in accuracy and instruction-following. The authors open-source a 220-task "gold subset," provide an experimental automated grader, and analyze how factors like reasoning effort, prompting, and scaffolding impact model performance, highlighting both the potential and current limitations of AI in professional workflows.
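
For readers unfamiliar with pairwise evaluation, here is a minimal sketch of how a win rate can be computed from expert judgments; the data format and the tie-handling convention are assumptions for illustration, not OpenAI's grading pipeline.

```python
# Minimal sketch (hypothetical data format, not OpenAI's grader): compute a
# model's win rate from expert pairwise judgments of model deliverables versus
# human expert deliverables, counting ties as half a win.

from collections import Counter

# Each judgment is "model", "human", or "tie" for one GDPval-style task.
judgments = ["model", "human", "tie", "model", "human", "model", "tie", "human"]

def win_rate(labels):
    counts = Counter(labels)
    total = sum(counts.values())
    # Ties credited as half a win, a common convention in pairwise evaluation.
    return (counts["model"] + 0.5 * counts["tie"]) / total

print(f"model win/tie rate vs experts: {win_rate(judgments):.2%}")
```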

Friday, October 10, 2025

Ethical challenges and evolving strategies in the integration of artificial intelligence into clinical practice

Weiner, E. B., et al. (2025).
PLOS Digital Health, 4(4), e0000810.

Abstract

Artificial intelligence (AI) has rapidly transformed various sectors, including healthcare, where it holds the potential to transform clinical practice and improve patient outcomes. However, its integration into medical settings brings significant ethical challenges that need careful consideration. This paper examines the current state of AI in healthcare, focusing on five critical ethical concerns: justice and fairness, transparency, patient consent and confidentiality, accountability, and patient-centered and equitable care. These concerns are particularly pressing as AI systems can perpetuate or even exacerbate existing biases, often resulting from non-representative datasets and opaque model development processes. The paper explores how bias, lack of transparency, and challenges in maintaining patient trust can undermine the effectiveness and fairness of AI applications in healthcare. In addition, we review existing frameworks for the regulation and deployment of AI, identifying gaps that limit the widespread adoption of these systems in a just and equitable manner. Our analysis provides recommendations to address these ethical challenges, emphasizing the need for fairness in algorithm design, transparency in model decision-making, and patient-centered approaches to consent and data privacy. By highlighting the importance of continuous ethical scrutiny and collaboration between AI developers, clinicians, and ethicists, we outline pathways for achieving more responsible and inclusive AI implementation in healthcare. These strategies, if adopted, could enhance both the clinical value of AI and the trustworthiness of AI systems among patients and healthcare professionals, ensuring that these technologies serve all populations equitably.

Here are some thoughts:

This article is important for psychologists because it highlights the critical ethical challenges surrounding patient trust, consent, and human-AI interaction in clinical settings—areas central to psychological practice. It details how patient demographics influence trust in AI and emphasizes the need for empathetic, transparent communication from AI systems to address patient anxieties and perceptions of "uniqueness neglect." Furthermore, it discusses "automation bias," where clinicians may overly rely on AI, a phenomenon psychologists must understand to support ethical decision-making and preserve the human-centered, therapeutic aspects of care.

Thursday, October 9, 2025

Turn it and Turn it Again: The Updated Inclusive Model of Ethical Decision Making

McAuliffe, D., & Greenslade, L. (2025).
Ethics and Social Welfare, 1–13.

Abstract

Ethical decision making is a critical skill for practitioners of all disciplines in the social, health and human services. Having capacity to engage proactively with decisions that will impact people’s lives in a way that is rigorous, principled, and considered, is the hallmark of an ethically competent practitioner. There have been multiple models of ethical decision making that have provided structured examples of the questions that should be asked of self and others while navigating an ethical dilemma. The Inclusive Model of ethical decision-making was first published by McAuliffe & Chenoweth in this journal in 2008. In reviewing the Inclusive model some 15 years since its original development, it is timely to reconsider the value of incorporating a 5th ethical platform, conceptualised as Interdependence, to draw on the importance of the relationships between humans, non-humans, and the natural world. This paper provides an extension of previous work to bring the Inclusive model of ethical decision making to a better coherence with current developments in both theory and practice.

Here are some thoughts:

This article presents an updated, practical ethical decision-making model that explicitly incorporates "Interdependence," urging practitioners to consider the impact of their decisions on relationships, non-human animals, and the environment—areas increasingly relevant to holistic client care. The model’s structured, five-step process (defining the dilemma, mapping legitimacy, gathering information, considering alternatives, and critical evaluation) provides a clear, systematic framework for navigating complex real-world dilemmas, which is invaluable in clinical practice. Furthermore, its emphasis on consultation, cultural sensitivity, and critical reflection aligns with core psychological competencies, making it a versatile tool for individual practitioners and interdisciplinary teams.

Wednesday, October 8, 2025

Six Fallacies in Substituting Large Language Models for Human Participants

Lin, Z. (2025).
Advances in Methods and Practices in Psychological Science, 8(3).

Abstract

Can artificial-intelligence (AI) systems, such as large language models (LLMs), replace human participants in behavioral and psychological research? Here, I critically evaluate the replacement perspective and identify six interpretive fallacies that undermine its validity. These fallacies are (a) equating token prediction with human intelligence, (b) treating LLMs as the average human, (c) interpreting alignment as explanation, (d) anthropomorphizing AI systems, (e) essentializing identities, and (f) substituting model data for human evidence. Each fallacy represents a potential misunderstanding about what LLMs are and what they can tell researchers about human cognition. In the analysis, I distinguish levels of similarity between LLMs and humans, particularly functional equivalence (outputs) versus mechanistic equivalence (processes), while highlighting both technical limitations (addressable through engineering) and conceptual limitations (arising from fundamental differences between statistical and biological intelligence). For each fallacy, specific safeguards are provided to guide responsible research practices. Ultimately, the analysis supports conceptualizing LLMs as pragmatic simulation tools—useful for role-play, rapid hypothesis testing, and computational modeling (provided their outputs are validated against human data)—rather than as replacements for human participants. This framework enables researchers to leverage language models productively while respecting the fundamental differences between machine intelligence and human thought.

Here are some thoughts:

This article critically examines the growing trend of using Large Language Models (LLMs) as direct substitutes for human participants in psychological and behavioral research. While acknowledging that LLMs can generate human-like text and sometimes mirror average human responses, Lin argues that this "replacement perspective" is fundamentally flawed and identifies six key interpretive fallacies that undermine its validity. These fallacies are: equating statistical token prediction with genuine human intelligence; assuming LLM outputs represent an "average human"; interpreting alignment between model and human outputs as evidence of shared cognitive mechanisms; anthropomorphizing AI systems by attributing human mental states to them; essentializing social identities by treating demographic labels as fixed and homogeneous; and directly substituting model-generated data for human evidence without validation. Lin contends that LLMs should be viewed not as replacements, but as pragmatic simulation tools useful for tasks like rapid hypothesis testing, role-playing, and computational modeling—provided their outputs are always validated against real human data. The article emphasizes the fundamental, often conceptual, differences between statistical machine intelligence and biologically grounded, embodied human cognition.
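
The validation step Lin insists on can be as simple as comparing condition-level summaries from an LLM simulation against those from human participants before drawing any conclusions. The sketch below uses invented numbers purely to illustrate that check.

```python
# Minimal sketch of validating LLM-simulated responses against human data:
# compare condition-level means from the model with means from real
# participants. All numbers here are invented for illustration.

import numpy as np

conditions = ["control", "time_pressure", "incentive", "framing"]
human_means = np.array([4.1, 3.2, 4.8, 3.9])  # hypothetical human ratings (1-7 scale)
llm_means = np.array([4.4, 3.0, 5.1, 4.5])    # hypothetical LLM-simulated ratings

# Pearson correlation and mean absolute gap between model and human responses.
r = np.corrcoef(human_means, llm_means)[0, 1]
gap = np.mean(np.abs(human_means - llm_means))

print(f"alignment r = {r:.2f}, mean absolute difference = {gap:.2f}")
# High alignment on outputs still says nothing about shared mechanisms
# (the article's third fallacy), so it licenses use only as a simulation tool.
```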