Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care

Welcome to the nexus of ethics, psychology, morality, technology, health care, and philosophy

Wednesday, October 15, 2025

Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data

Gerstgrasser, M., Schaeffer, R., et al. (2024).
arXiv preprint.

Abstract

The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent investigations into model-data feedback loops proposed that such loops would lead to a phenomenon termed model collapse, under which performance progressively degrades with each model-data feedback iteration until fitted models become useless. However, those studies largely assumed that new data replace old data over time, where an arguably more realistic assumption is that data accumulate over time. In this paper, we ask: what effect does accumulating data have on model collapse? We empirically study this question by pretraining sequences of language models on text corpora. We confirm that replacing the original real data by each generation's synthetic data does indeed tend towards model collapse, then demonstrate that accumulating the successive generations of synthetic data alongside the original real data avoids model collapse; these results hold across a range of model sizes, architectures, and hyperparameters. We obtain similar results for deep generative models on other types of real data: diffusion models for molecule conformation generation and variational autoencoders for image generation. To understand why accumulating data can avoid model collapse, we use an analytically tractable framework introduced by prior work in which a sequence of linear models are fit to the previous models' outputs. Previous work used this framework to show that if data are replaced, the test error increases with the number of model-fitting iterations; we extend this argument to prove that if data instead accumulate, the test error has a finite upper bound independent of the number of iterations, meaning model collapse no longer occurs.

Here are some thoughts:

This research directly addresses a critical concern for psychologists and researchers who rely on AI: the potential degradation of AI models when they are trained on data generated by previous AI models, a phenomenon known as "model collapse." While prior studies, often assuming old data is discarded and replaced with new AI-generated data, painted a dire picture of inevitable performance decline, this paper offers a more optimistic and realistic perspective. The authors argue that in the real world, data accumulates over time—new AI-generated content is added to the existing pool of human-generated data, not substituted for it. Through extensive experiments with language models, image generators, and molecular modeling tools, they demonstrate that this accumulation of data effectively prevents model collapse. Performance remains stable or even improves across successive generations of models trained on the growing, mixed dataset. The paper further supports this finding with a mathematical proof using a simplified linear model, showing that accumulating data bounds the error, preventing it from growing uncontrollably. For psychologists, this suggests that the increasing presence of AI-generated content on the internet may not catastrophically corrupt future AI tools used in research or clinical settings, as long as training datasets continue to incorporate diverse, original human data alongside synthetic content.
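To make the contrast between the two training regimes concrete, here is a minimal toy simulation in Python. It is a sketch under simplifying assumptions — a one-dimensional Gaussian stands in for a generative model, and the sample size, number of generations, and seed are arbitrary — not the paper's actual linear-model framework or experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mu, true_sigma = 0.0, 1.0   # ground-truth "human data" distribution (toy assumption)
n, generations = 100, 50         # per-generation sample size and number of model-fitting rounds

def run(mode):
    real = rng.normal(true_mu, true_sigma, n)          # the original real data
    pool, train = real.copy(), real
    errors = []
    for _ in range(generations):
        mu_hat, sigma_hat = train.mean(), train.std()  # "fit" a model to the current training set
        synthetic = rng.normal(mu_hat, sigma_hat, n)   # generate synthetic data from the fitted model
        if mode == "replace":
            train = synthetic                          # discard all older data
        else:  # "accumulate"
            pool = np.concatenate([pool, synthetic])   # keep real data plus every synthetic generation
            train = pool
        errors.append((mu_hat - true_mu) ** 2)         # squared error of the fitted mean
    return errors

for mode in ("replace", "accumulate"):
    print(f"{mode:10s} final squared error of fitted mean: {run(mode)[-1]:.4f}")
```

Across seeds, the "replace" regime tends to drift steadily away from the true distribution, while the "accumulate" regime stays anchored by the original real data — the same qualitative pattern the paper reports for language models, diffusion models, and variational autoencoders.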

Tuesday, October 14, 2025

Ethical principles for regulatory risk decision-making

Bhuller, Y., et al. (2025).
Regulatory Toxicology and Pharmacology, 105813.

Abstract

Risk assessors, managers, and decision-makers are responsible for evaluating diverse human, environmental, and animal health risks. Although the critical elements of risk assessment and management are well-described in national and international documents, the ethical issues involved in risk decision-making have received comparatively little attention to date. To address this aspect, this article elaborates fundamental ethical principles designed to support fair, balanced, and equitable risk-based decision-making practices. Experts and global thinkers in risk, health, regulatory, and animal sciences were convened to share their lived experiences in relation to the intersection between risk science and analysis, regulatory science, and public health. Through a participatory and knowledge translation approach, an integrated risk decision-making model, with ethical principles and considerations, was developed and applied using diverse, contemporary risk decision-making and regulatory contexts. The ten principles - autonomy, minimize harm, maintain respect and trust, adaptability, reduce disparities, holistic, fair and just, open and transparent, stakeholder engagement, and One Health lens - demonstrate how public sector values and moral norms (i.e., ethics) are relevant to risk decision-making. We also hope these principles and considerations stimulate further discussion, debate, and an increased awareness of the application of ethics in identifying, assessing, and managing health risks.

Here are some thoughts:

This article is critically important for psychologists because it explicitly integrates human values, behavior, and social dynamics into the core of regulatory risk decision-making. While framed for risk assessors and policymakers, the article’s ten ethical principles—such as Autonomy, Minimize Harm, Maintain Respect and Trust, Reduce Disparities, and Stakeholder Engagement—are fundamentally psychological and social constructs. Psychologists possess the expertise to understand how these principles operate in practice: how people perceive and process risk information, how trust is built or eroded through communication, how cognitive biases influence judgment under uncertainty, and how social, cultural, and economic disparities affect vulnerability and resilience. The article’s emphasis on “One Health,” which connects human, animal, and environmental well-being, further demands a systems-thinking approach that psychologists are well-equipped to contribute to, particularly in designing interventions, facilitating stakeholder dialogues, and crafting transparent, culturally appropriate risk communications. By providing a formal ethical framework for decision-making, the article creates a vital bridge for psychologists to apply their science in high-stakes, real-world contexts where human welfare, equity, and ethical conduct are paramount.

Monday, October 13, 2025

End-of-Life Decision Making in Multidisciplinary Teams: Ethical Challenges and Solutions–A Systematic Review

Mujayri, H., et al. (2024).
jicrcr.com.

Abstract

Background: To provide high-quality end-of-life (EOL) care, multidisciplinary teams (MDTs) must proficiently navigate the ethical dilemmas that arise in EOL care while maintaining an equilibrium between patient autonomy, family involvement, and cultural competence. Yet cohesive EOL decision-making continues to be undermined by communication barriers, role ambiguity, and insufficient ethics training within MDTs. These issues demonstrate the necessity of structured protocols to help MDTs make ethically sound decisions in EOL care.

Aim: The purpose of this paper is to identify and review the major factors that affect ethical decision-making in MDTs providing EOL care, and to explore the themes of patient autonomy, communication, cultural sensitivity, ethics training, and institutional barriers.

Method: Ten studies were reviewed systematically according to PRISMA criteria, drawing on the PubMed, Scopus, Web of Science, and CINAHL databases. The analysis included studies published between 2020 and 2024 that examined the ethical decision-making challenges MDTs face in EOL care and the solutions proposed to address them.

Results: Four key themes were identified: balancing patient autonomy with family input, communication challenges within MDTs, cultural sensitivity in EOL care, and the necessity of ethics training. Results indicate that MDTs often face ethical dilemmas when patients’ wishes diverge from those of their families, and experience communication difficulties that degrade care quality. Simulation emerged as an engaging and effective way to develop cultural awareness and ethics training in EOL care practice.

Conclusion: Ethical challenges in EOL decision-making must be addressed through interventions encompassing improved ethics training, MDT role clarity, culturally aware practice, and institutional support. These strategies, if implemented, will support MDTs in providing patient-centered and ethically sound EOL care. Further research on ethics training, communication frameworks, and cultural competence in EOL decision-making within MDTs is warranted.

Here are some thoughts:

This article is critically important for practicing psychologists because it directly addresses the core ethical, communicative, and interpersonal challenges they face as integral members of multidisciplinary teams (MDTs) in end-of-life (EOL) care. The systematic review identifies key themes—such as balancing patient autonomy with family input, navigating communication breakdowns within teams, and addressing cultural and religious sensitivities—that are central to a psychologist’s role. Psychologists are often the clinicians best equipped to facilitate difficult family meetings, mediate conflicts between patient wishes and family or team concerns, and ensure that care is culturally competent and patient-centered. The article underscores a significant gap in ethics training and recommends simulation-based learning, urging psychologists to seek or advocate for such training to better handle complex moral dilemmas. Furthermore, by highlighting institutional barriers and role ambiguity, it empowers psychologists to push for clearer team protocols and systemic support, ultimately enabling them to contribute more effectively to ethically sound, compassionate, and collaborative EOL decision-making.

Saturday, October 11, 2025

GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks

OpenAI. (2025).

We introduce GDPval, a benchmark designed to evaluate how well AI models perform economically valuable tasks in real-world settings. GDPval includes the majority of work activities defined by the U.S. Bureau of Labor Statistics for 44 occupations across the nine sectors that contribute most to U.S. GDP. The tasks in GDPval are based on the actual work of industry professionals who average 14 years of experience.

Our findings show that frontier AI models are improving on GDPval at a roughly linear rate over time. The strongest models now produce deliverables that are approaching the quality of work produced by industry experts. We also examine how pairing frontier models with human oversight could allow these tasks to be completed more quickly and at lower cost than by unaided experts.

Model performance improves further when reasoning effort, task context, and structured guidance are increased. To support future research on real-world AI capabilities, we are releasing a gold-standard subset of 220 tasks and providing a public automated grading service at evals.openai.com.

Here is my brief summary:

This paper introduces GDPval, a new benchmark developed by OpenAI to evaluate AI models on real-world, economically valuable tasks that reflect actual knowledge work across 44 occupations and 9 major U.S. GDP sectors. Unlike traditional academic benchmarks, GDPval emphasizes realism, representativeness, and multi-modality, with tasks based on expert-validated work products that take professionals an average of 7 hours to complete. The evaluation uses pairwise comparisons by industry experts to measure AI performance, finding that top models like Claude Opus 4.1 and GPT-5 are approaching human-level performance in some areas—Claude excels in aesthetics and formatting, while GPT-5 leads in accuracy and instruction-following. The authors open-source a 220-task "gold subset," provide an experimental automated grader, and analyze how factors like reasoning effort, prompting, and scaffolding impact model performance, highlighting both the potential and current limitations of AI in professional workflows.
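As a rough illustration of how pairwise expert judgments become a headline score, here is a small Python sketch that tallies a model's rate of wins against expert deliverables, counting ties as half a win. The verdict list and the tie-handling convention are assumptions for illustration, not OpenAI's documented grading scheme.

```python
from collections import Counter

# Hypothetical expert verdicts, one per task: did the model's deliverable
# "win", "tie", or "lose" against the industry expert's deliverable?
verdicts = ["win", "lose", "tie", "win", "lose", "lose", "win", "tie"]

counts = Counter(verdicts)
total = sum(counts.values())

# One common convention: count a tie as half a win.
win_rate = (counts["win"] + 0.5 * counts["tie"]) / total
print(f"wins={counts['win']}  ties={counts['tie']}  losses={counts['lose']}  "
      f"tie-adjusted win rate: {win_rate:.1%}")
```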

Friday, October 10, 2025

Ethical challenges and evolving strategies in the integration of artificial intelligence into clinical practice

Weiner, E. B., et al. (2025).
PLOS Digital Health, 4(4), e0000810.

Abstract

Artificial intelligence (AI) has rapidly transformed various sectors, including healthcare, where it holds the potential to transform clinical practice and improve patient outcomes. However, its integration into medical settings brings significant ethical challenges that need careful consideration. This paper examines the current state of AI in healthcare, focusing on five critical ethical concerns: justice and fairness, transparency, patient consent and confidentiality, accountability, and patient-centered and equitable care. These concerns are particularly pressing as AI systems can perpetuate or even exacerbate existing biases, often resulting from non-representative datasets and opaque model development processes. The paper explores how bias, lack of transparency, and challenges in maintaining patient trust can undermine the effectiveness and fairness of AI applications in healthcare. In addition, we review existing frameworks for the regulation and deployment of AI, identifying gaps that limit the widespread adoption of these systems in a just and equitable manner. Our analysis provides recommendations to address these ethical challenges, emphasizing the need for fairness in algorithm design, transparency in model decision-making, and patient-centered approaches to consent and data privacy. By highlighting the importance of continuous ethical scrutiny and collaboration between AI developers, clinicians, and ethicists, we outline pathways for achieving more responsible and inclusive AI implementation in healthcare. These strategies, if adopted, could enhance both the clinical value of AI and the trustworthiness of AI systems among patients and healthcare professionals, ensuring that these technologies serve all populations equitably.

Here are some thoughts:

This article is important for psychologists because it highlights the critical ethical challenges surrounding patient trust, consent, and human-AI interaction in clinical settings—areas central to psychological practice. It details how patient demographics influence trust in AI and emphasizes the need for empathetic, transparent communication from AI systems to address patient anxieties and perceptions of "uniqueness neglect." Furthermore, it discusses "automation bias," where clinicians may overly rely on AI, a phenomenon psychologists must understand to support ethical decision-making and preserve the human-centered, therapeutic aspects of care.

Thursday, October 9, 2025

Turn it and Turn it Again: The Updated Inclusive Model of Ethical Decision Making

McAuliffe, D., & Greenslade, L. (2025).
Ethics and Social Welfare, 1–13.

Abstract

Ethical decision making is a critical skill for practitioners of all disciplines in the social, health and human services. Having capacity to engage proactively with decisions that will impact people’s lives in a way that is rigorous, principled, and considered, is the hallmark of an ethically competent practitioner. There have been multiple models of ethical decision making that have provided structured examples of the questions that should be asked of self and others while navigating an ethical dilemma. The Inclusive Model of ethical decision-making was first published by McAuliffe & Chenoweth in this journal in 2008. In reviewing the Inclusive model some 15 years since its original development, it is timely to reconsider the value of incorporating a 5th ethical platform, conceptualised as Interdependence, to draw on the importance of the relationships between humans, non-humans, and the natural world. This paper provides an extension of previous work to bring the Inclusive model of ethical decision making to a better coherence with current developments in both theory and practice.

Here are some thoughts:

This article presents an updated, practical ethical decision-making model that explicitly incorporates "Interdependence," urging practitioners to consider the impact of their decisions on relationships, non-human animals, and the environment—areas increasingly relevant to holistic client care. The model’s structured, five-step process (defining the dilemma, mapping legitimacy, gathering information, considering alternatives, and critical evaluation) provides a clear, systematic framework for navigating complex real-world dilemmas, which is invaluable in clinical practice. Furthermore, its emphasis on consultation, cultural sensitivity, and critical reflection aligns with core psychological competencies, making it a versatile tool for individual practitioners and interdisciplinary teams.

Wednesday, October 8, 2025

Six Fallacies in Substituting Large Language Models for Human Participants

Lin, Z. (2025).
Advances in Methods and Practices in Psychological Science, 8(3).

Abstract

Can artificial-intelligence (AI) systems, such as large language models (LLMs), replace human participants in behavioral and psychological research? Here, I critically evaluate the replacement perspective and identify six interpretive fallacies that undermine its validity. These fallacies are (a) equating token prediction with human intelligence, (b) treating LLMs as the average human, (c) interpreting alignment as explanation, (d) anthropomorphizing AI systems, (e) essentializing identities, and (f) substituting model data for human evidence. Each fallacy represents a potential misunderstanding about what LLMs are and what they can tell researchers about human cognition. In the analysis, I distinguish levels of similarity between LLMs and humans, particularly functional equivalence (outputs) versus mechanistic equivalence (processes), while highlighting both technical limitations (addressable through engineering) and conceptual limitations (arising from fundamental differences between statistical and biological intelligence). For each fallacy, specific safeguards are provided to guide responsible research practices. Ultimately, the analysis supports conceptualizing LLMs as pragmatic simulation tools—useful for role-play, rapid hypothesis testing, and computational modeling (provided their outputs are validated against human data)—rather than as replacements for human participants. This framework enables researchers to leverage language models productively while respecting the fundamental differences between machine intelligence and human thought.

Here are some thoughts:

This article critically examines the growing trend of using Large Language Models (LLMs) as direct substitutes for human participants in psychological and behavioral research. While acknowledging that LLMs can generate human-like text and sometimes mirror average human responses, Lin argues that this "replacement perspective" is fundamentally flawed and identifies six key interpretive fallacies that undermine its validity. These fallacies are: equating statistical token prediction with genuine human intelligence; assuming LLM outputs represent an "average human"; interpreting alignment between model and human outputs as evidence of shared cognitive mechanisms; anthropomorphizing AI systems by attributing human mental states to them; essentializing social identities by treating demographic labels as fixed and homogeneous; and directly substituting model-generated data for human evidence without validation. Lin contends that LLMs should be viewed not as replacements, but as pragmatic simulation tools useful for tasks like rapid hypothesis testing, role-playing, and computational modeling—provided their outputs are always validated against real human data. The article emphasizes the fundamental, often conceptual, differences between statistical machine intelligence and biologically grounded, embodied human cognition.

Tuesday, October 7, 2025

How a new mental-health app is helping patients reality-check their hallucinations

Chris Hannay
The Globe and Mail (Toronto)
Originally published 21 AUG 25

As new digital tools powered by AI raise fears of misinformation, a Canadian startup has gone the other way: using technology to help patients with severe mental illness perform reality checks on their hallucinations.

The digital health app, called A4i (which stands for "App for Independence"), was created by software developer Amos Adler and Sean Kidd, a senior scientist at the Centre for Addiction and Mental Health. The company was spun out of CAMH and is now being adopted by some mental-health hospitals in Canada and the U.S., including the Waypoint Centre for Mental Health Care in Ontario and the Riverside University Health System in Southern California.

The hallmark feature is an auditory hallucination detector, for which the company got a patent in 2023. A patient can use the app to record sounds around them and, by answering prompts, help sort out whether what they are hearing is real or imagined.

Dr. Kidd said the inspiration for the feature came from a patient. The young man had schizophrenia and was experiencing persistent, distressing auditory hallucinations. He'd bring audio recordings taken in his apartment to sessions and ask Dr. Kidd if he could hear sounds such as voices or yelling. Dr. Kidd usually couldn't.

That led the psychologist to look into what phone-based tools might be available for such patients - he couldn't find any.



This is not an endorsement; it is shared for educational purposes only.

Monday, October 6, 2025

DeepResearch Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks

Wan, H., Yang, C., et al. (2025).
arXiv.org

Abstract

Deep research agents have attracted growing attention for their potential to orchestrate multi-stage research workflows, spanning literature synthesis, methodological design, and empirical verification. Despite these strides, evaluating their research capability faithfully is rather challenging due to the difficulty of collecting frontier research questions that genuinely capture researchers' attention and intellectual curiosity. To address this gap, we introduce DeepResearch Arena, a benchmark grounded in academic seminars that capture rich expert discourse and interaction, better reflecting real-world research environments and reducing the risk of data leakage. To automatically construct DeepResearch Arena, we propose a Multi-Agent Hierarchical Task Generation (MAHTG) system that extracts research-worthy inspirations from seminar transcripts. The MAHTG system further translates research-worthy inspirations into high-quality research tasks, ensuring the traceability of research task formulation while filtering noise. With the MAHTG system, we curate DeepResearch Arena with over 10,000 high-quality research tasks from over 200 academic seminars, spanning 12 disciplines, such as literature, history, and science. Our extensive evaluation shows that DeepResearch Arena presents substantial challenges for current state-of-the-art agents, with clear performance gaps observed across different models.

My thoughts: In essence, this paper is important to psychologists because it tackles the evaluation of AI on tasks that closely mirror the complex, ill-defined, and creative nature of human scientific inquiry. It provides both a new tool for assessing AI (which will increasingly interact with human researchers) and a novel methodological framework that could be adapted to study human cognition itself.