Journal article
Ethics and Information Technology, vol. 27, no. 28, 2025
APA
Lindström, A. D., Methnani, L., Krause, L., Ericson, P., de Rituerto de Troya, Í. M., Mollo, D. C., & Dobbe, R. (2025). Helpful, harmless, honest? Sociotechnical limits of AI alignment and safety through Reinforcement Learning from Human Feedback. Ethics and Information Technology, 27(28). https://doi.org/10.1007/s10676-025-09837-2
Chicago/Turabian
Lindström, Adam Dahlgren, Leila Methnani, Lea Krause, Petter Ericson, Íñigo Martínez de Rituerto de Troya, Dimitri Coelho Mollo, and Roel Dobbe. “Helpful, Harmless, Honest? Sociotechnical Limits of AI Alignment and Safety through Reinforcement Learning from Human Feedback.” Ethics and Information Technology 27, no. 28 (2025).
MLA
Lindström, Adam Dahlgren, et al. “Helpful, Harmless, Honest? Sociotechnical Limits of AI Alignment and Safety through Reinforcement Learning from Human Feedback.” Ethics and Information Technology, vol. 27, no. 28, 2025, doi:10.1007/s10676-025-09837-2.
BibTeX
@article{lindstrom2025helpful,
title = {Helpful, harmless, honest? Sociotechnical limits of AI alignment and safety through Reinforcement Learning from Human Feedback},
year = {2025},
number = {28},
journal = {Ethics and Information Technology},
volume = {27},
doi = {10.1007/s10676-025-09837-2},
author = {Lindström, Adam Dahlgren and Methnani, Leila and Krause, Lea and Ericson, Petter and de Rituerto de Troya, Íñigo Martínez and Mollo, Dimitri Coelho and Dobbe, Roel},
}