Cornell University highlights a working paper that a number of collaborators and I have been working on to better understand how AI chatbots might complement decision making or potentially make things go awry, such as by reinforcing blind spots that we generally have as humans.
The current working paper can be found here on SSRN.
A few weeks ago, we covered ethics in my behavioral economics class at Cornell Tech. The case example below strikes me as tipflation plus pure deception, which involve ethical issues stacked on top of one another and put the end consumer in a terrible place (e.g., stating the tip as 25% but pre-filling a dollar amount that is actually even larger, say 38%, under the guise that it is 25%). First, the consumer has to determine what is fair and deserved as a tip, and that is complicated in and of itself because they often can’t judge how tips are split among the restaurant operations staff and others. Second, both tipflation and deception nudges likely prey on fast-thinking psychological processes and may disproportionately affect those with lower numeracy (and possibly lower socioeconomic status). The nudge to “check your math” is likely a move in the right direction, but it takes reflective, slow thinking along with a certain level of math skill and cognitive stamina (which could be additionally challenging if someone is cognitively depleted after a meal).
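To make the “check your math” step concrete, here is a minimal sketch of the arithmetic involved, using hypothetical receipt numbers (the subtotal, stated rate, and pre-filled tip below are illustrative, not taken from any actual case):

```python
# "Check your math" on a suggested tip: a receipt states a 25% tip
# but pre-fills a larger dollar amount. All numbers are hypothetical.

def actual_tip_rate(bill_subtotal: float, tip_amount: float) -> float:
    """Return the tip as a percentage of the pre-tip subtotal."""
    return 100 * tip_amount / bill_subtotal

subtotal = 100.00       # hypothetical pre-tip bill
stated_rate = 25        # percentage printed on the receipt
suggested_tip = 38.00   # dollar amount actually pre-filled

honest_tip = subtotal * stated_rate / 100
print(f"A stated {stated_rate}% tip should be ${honest_tip:.2f}")
print(f"The pre-filled tip is actually {actual_tip_rate(subtotal, suggested_tip):.0f}%")
```

The slow, System 2 work here is trivial for a computer but nontrivial for a depleted diner: divide the suggested dollar amount by the pre-tip subtotal and compare against the stated percentage.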
To recap, some of the items we discussed in class include:
Goal alignment between the nudger and nudgee
Degree of control and influence of the nudge (e.g., to what extent a nudge invokes fast System 1 automatic thinking versus slower System 2 reflective thinking)
Fairness considerations (e.g., moral foundation theory or organizational justice principles, such as procedural justice)
Heterogeneous treatment effects (e.g., negative effects on those with lower socioeconomic status, numeracy, cognitive stress or depletion)
00:15 Exploring the democratization of nudges to enhance organizational awareness and accessibility of behavioral science, shedding light on various models and ethical considerations.
05:02 Initiating a behavioral finance institute and delving into the intersection of psychology and economics, highlighting the importance of understanding behavioral economics in navigating financial decision-making.
09:26 Analyzing the multifaceted aspects of retirement planning, including decomposing the problem, aligning goals, and acknowledging uncertain outcomes, while tracing the emergence of behavioral science from foundational work to its current application in various sectors.
14:04 The expansion of behavioral decision-making groups in academic institutions has led to increased labor in the market and the emergence of boutique consultancies, advocating for the incorporation of behavioral economics principles across various business sectors, suggesting a gradual implementation approach starting with anchor areas to foster organizational learning and maximize effectiveness.
18:41 Addressing retirement preparation as a marathon with potential hazards, emphasizing the importance of simplifying choices, enhancing financial literacy, and reframing savings concepts, while advocating for pension system adaptability to accommodate evolving work dynamics and longevity.
23:27 Advocating for a balanced approach in encouraging smarter savings behaviors, addressing the diverse perspectives on longevity and health, advocating for increased research and collaboration, and fostering leadership that prioritizes sustainability and inclusivity in pension systems.
27:56 AI, like ChatGPT, presents opportunities for automating tasks but requires human oversight to mitigate biases, particularly in decision-making processes where AI may inherit similar biases to humans, highlighting the importance of careful framing and consideration of alternative explanations.
32:39 AI platforms exhibit strengths and weaknesses, offering insights into when to integrate them into decision-making processes while also emphasizing the importance of democratizing access, raising awareness, and simplifying usability to ensure broader adoption and equitable benefits for all.
As a follow-on post to summer 2023 exploratory work that is happening with the Behavioral Economics Research and Education (BERE) Lab, we’ve started to compile early results. Here are some test result summaries of different AI platforms based on the conjunction fallacy test (Linda problem). Note that platforms vary based on degree of live access to the internet and incorporation of slower System 2 thinking influences (although these characteristics are also confounded with platform implementation). Here we test ChatGPT 3.5, Bing Chat AI (based on GPT 4), and Google Bard.
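For readers unfamiliar with the Linda problem, the conjunction rule it tests is simple to state: for any events A and B, P(A and B) can never exceed P(A). A minimal consistency check, using hypothetical judged probabilities purely for illustration:

```python
# The conjunction rule behind the Linda problem: for any events A and B,
# P(A and B) <= P(A). The probabilities below are hypothetical, chosen
# only to illustrate the fallacy pattern.

def violates_conjunction_rule(p_a: float, p_a_and_b: float) -> bool:
    """True if a judged P(A and B) exceeds the judged P(A)."""
    return p_a_and_b > p_a

p_bank_teller = 0.10          # judged P(Linda is a bank teller)
p_teller_and_feminist = 0.25  # judged P(bank teller AND feminist) -- rated higher!

print(violates_conjunction_rule(p_bank_teller, p_teller_and_feminist))  # True
```

Human respondents classically rate the conjunction as more probable than the single event; the question for the platform tests is whether each AI makes the same mistake.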
Interesting questions to reflect on: – How do AI platforms differ? – Which gets things right? – Which do you trust? – To what extent will AI adoption be affected by use case, accuracy, and trust?
For the past two summers, I have volunteered my time to help a limited number of students pursue interests in behavioral economics and build their resumes of experience. This summer I will expand my efforts somewhat, although I hope eventually to find a more sustainable and scalable model in terms of funding, operations, and potential synergies with other organizations.
Here’s the natural extension of what I’ll be doing for the summer of 2023.
The Behavioral Economics Research and Education (BERE) Lab by Stephen Shu is an effort geared toward helping college students and young professionals with either the empirical or applied practice of behavioral economics. BERE efforts are in support of open science and the advancement of education. Where possible, students or young graduates may be supported by grants, and BERE welcomes opportunities to help students and young graduates obtain grants or corporate sponsorship.
Students and young graduates may pursue exploratory replication studies, expansion research studies, and corporate-focused research (for use in educational settings). Where possible, students are encouraged to develop empirical or professional skills (e.g., R or Stata statistical software, Python programming, communications, writing).
The research theme for 2023 includes exploratory work around the intersection of generative artificial intelligence (AI) and behavioral economics, such as similarities and differences between AI and human decision making across different platforms.
TL;DR: This episode of “The Tao of Chao” podcast features Dr. Stephen Shu, a specialist in behavioral economics. The discussion explores how our decision-making processes are influenced by biases and cognitive frameworks rooted in our primal survival instincts. With the increasing volume of information and opinions available, our brains struggle to process it all, leading to echo chambers and confirmation biases. The conversation highlights the importance of recognizing our fast and slow thinking capabilities and encourages reflective thinking to counteract these biases and make better decisions in an ever-changing world.
TL;DR: A thinking tool called “prospective hindsight” can be used to explore different outcomes by imagining a future event and examining the steps that led to it. Avoidance in decision-making can stem from complexity, trade-offs, or a reluctance to consider negative outcomes. While it is impossible to predict the future accurately, a detailed planning process that considers both logic and emotions can help make more informed decisions. In investing, scenario planning and understanding the transmission mechanisms of events can improve decision-making, but it is essential to be aware of biases and actively seek counterfactual information. The abundance of data does not guarantee better decision-making, and the importance of information depends on the context and the significance of the decision.
A student recently asked me whether design thinking was different from behavioral economics thinking. In a nutshell, I believe the disciplines complement one another and should not be viewed as separate islands. That said, insights from behavioral economics and psychology can help us to become more thoughtful designers of products, customer experiences, etc.
One important behavioral area to consider is the role of memory in judgment and decision-making processes. Last week at Cornell Tech I facilitated a brief discussion of a new startup that was trying to address the issue of helping people to have perfect memory (as opposed to ever forgetting things). To what extent is this the perfect idea? What are some considerations from behavioral science?
To help feed that discussion, I relayed results from a study by Eric Johnson and colleagues that substantiates one theoretical role of memory in the decision-making process. Their study was conducted in the context of people valuing a commodity item, a coffee mug. The classic “endowment effect” is that people value things more when they possess an item (e.g., are given, or endowed with, a mug): people endowed with a mug valued mugs at $6.01, while people who were not endowed with a mug valued them at only $3.72.
However, things got more interesting when the researchers manipulated the natural, unguided memory retrieval process. They manipulated the process by reversing the order in which people thought about things. Before having people value the mug, they essentially asked sellers to think about negative aspects of the mug, and they asked buyers to think about positive aspects of the mug. The endowment effect essentially vanished with sellers now valuing the mug at $5.05 and buyers valuing the mug at $4.91.
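One way to picture the query-theory mechanism behind this result is as an order-weighted aggregation of retrieved aspects, where earlier-retrieved aspects carry more weight. The toy model below is my own illustrative sketch, not the authors’ model; the aspect values and decay weight are hypothetical:

```python
# A toy sketch of the query-theory idea: aspects of the mug retrieved
# earlier get more weight in the valuation. Aspect values and the decay
# factor are hypothetical, purely to show why retrieval order shifts value.

def valuation(aspect_values, decay=0.5):
    """Sum aspect values, weighting each by retrieval position (earlier = heavier)."""
    return sum(v * decay**i for i, v in enumerate(aspect_values))

positive_first = [+6.0, -2.0]  # "it's useful" retrieved before "it's clutter"
negative_first = [-2.0, +6.0]  # same aspects, reversed retrieval order

print(valuation(positive_first))  # 5.0 -- positives dominate
print(valuation(negative_first))  # 1.0 -- same content, lower value
```

The same two aspects produce different valuations depending solely on the order in which they are queried, which mirrors how reversing the retrieval order in the study collapsed the buyer–seller gap.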
So memory retrieval order matters. If we go back to the startup example, and we have artificial intelligence (AI) based products remembering stuff so that we can make decisions, how should designers determine what to present to us first, knowing that presentation order may influence our decisions? Could presenting too many memories cause decision paralysis? If one is to pursue a product like the one described, the design choices are not trivial.
The bottom line is that hopefully we can improve our design reasoning by remembering to factor in insights from behavioral science.
Reference: Johnson, E. J., Häubl, G., & Keinan, A. (2007). Aspects of endowment: A query theory of value construction. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(3), 461.
Let me first preface this post by saying two things: 1) the state of research in this area is very young (e.g., most citations in 2022 and 2023), and 2) my summary will be at risk of oversimplifying things and missing some nuance.
The students I have at Cornell Tech are really sharp and energized. In one class, an interesting question was raised about whether AI could be used as part of the testing process, such as to A/B pre-test interventions. To try to get a better understanding of this space, I sought to do a little research into the extent to which AI decision making resembles human decision making. So today I shared findings from a working paper that I recently read (Chen et al., 2023). The paper covers 18 common human biases relevant to operational decision-making (e.g., judgments regarding risk, evaluation of outcomes, and heuristics in decision making, such as System 1 versus System 2 thinking).
Here’s a summary of differences between ChatGPT and humans:
Judgments Regarding Risk – ChatGPT seems mostly to maximize expected payoffs, demonstrating risk aversion only when expected payoffs are equal. It does not appear to understand ambiguity. ChatGPT also exhibits high overconfidence, perhaps due to its large knowledge base.
Evaluation of Outcomes – ChatGPT is sensitive to framing, reference points, and the salience of information. It shows no sensitivity to sunk costs or the endowment effect (e.g., it may lack a concept of physical or psychological ownership).
Heuristics in Decision Making – More research is needed, although aspects such as confirmation bias are present. Additionally, ChatGPT can generate both classic System 1 responses (incorrect answers that in humans are typically driven by fast, automatic thinking) and System 2 responses (correct answers of the kind that in humans typically require slower, reflective thinking).
While the reasoning for these modes of responses is not fully known, it seems as though ChatGPT is extremely logical when it comes to things like maximizing expected value. However, perhaps due to its nature of trying to be conversational and responding to salient information provided by the user, it can be overly sensitive to framing effects.
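The expected-payoff pattern above is easy to make concrete. Here is a minimal sketch, with a hypothetical gamble versus a sure amount (the payoffs and probabilities are illustrative, not from the paper):

```python
# Illustrating the "maximize expected payoff" pattern attributed to ChatGPT:
# a pure expected-value maximizer prefers a risky gamble with a higher EV,
# whereas a risk-averse human often takes the sure thing. Numbers are hypothetical.

def expected_value(lottery):
    """Expected payoff of a lottery given (probability, payoff) pairs."""
    return sum(p * x for p, x in lottery)

gamble = [(0.5, 100.0), (0.5, 0.0)]  # EV = 50
sure_thing = [(1.0, 45.0)]           # EV = 45

# An EV maximizer picks the gamble; many humans prefer the certain $45.
print(expected_value(gamble) > expected_value(sure_thing))  # True
```

Risk aversion, in this framing, only shows up for the model as a tie-breaker when expected values are equal, whereas human choices routinely trade expected value for certainty.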
There are surely a lot of things to think about, opportunities to explore, and research to pursue.