Saturday, April 27

Deep Dive: ChatGPT, Part 1



How should students interact with ChatGPT, and how can UCLA be a responsible leader in artificial intelligence innovation? In this two-part miniseries about ChatGPT, Podcasts contributors Phoebe Brous, Kaitlyn Esperon and Sonia Wong seek to answer these questions and more. In this episode, Phoebe sits down with UCLA computer science assistant professor Violet Peng and UCLA information studies professor Ramesh Srinivasan.

Phoebe Brous: Hey y’all. My name is Phoebe Brous and I’m a news reporter, opinion writer and podcast contributor for the Daily Bruin, and welcome to Deep Dive, a Daily Bruin explanatory podcast that investigates national, UC-wide and campuswide events affecting Bruins. In this two-part series we explore the emergence and proliferation of ChatGPT, a natural language processing tool that functions as a search engine and facilitates human-like conversation with the user. When my friend introduced me to ChatGPT last October, I remember feeling sort of uneasy. It seemed like this new Terminator-esque technology was a little too good to be true, like the machine would evolve so rapidly, it might reach through the computer screen and possess me. Now I use it for everything: improving my writing, helping me with coding assignments or answering basic questions. You might use it for the same reasons. It’s obvious that through its advanced capabilities and potential impact on the classroom, ChatGPT and other AI chatbots will tremendously affect the education system. In March 2023, the UCLA Academic Senate announced that the use of ChatGPT or similar technologies is prohibited unless otherwise endorsed by the instructor. Aside from analyzing the immediate impacts of ChatGPT on the higher education system, this series also aims to illustrate its broader societal implications. Today we hear from two UCLA experts. First, Violet Peng is an assistant professor of computer science at UCLA. In our interview, we discussed the basic functions of GPT and how it expands upon previous innovation in natural language processing. Then I speak with Ramesh Srinivasan, a UCLA digital humanities professor who focuses on these technologies’ impacts on people, markets, culture and politics. He explains his hesitancy and concerns around ChatGPT and AI innovation more broadly. As a note, these interviews were conducted in April 2023. Enjoy.

Violet Peng: Yeah, so I’m an assistant professor in the computer science department working on natural language processing. And I actually work on two broad areas known as natural language understanding and natural language generation.

PB: And so for students who aren’t familiar with NLP, do you mind describing what that means and how it relates to ChatGPT?

VP: So NLP is a field where we teach a computer to understand human language, natural language. And usually we work with text sequences, and it’s related to ChatGPT because ChatGPT is basically an “NLG model,” a natural language generation model. So you’re talking to it, and then the model will generate human language so that you will feel, “Oh, I can have a natural conversation, and then I can get something out of it.”
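
To make “natural language generation” concrete: the sketch below uses GPT-2, a small open-source predecessor of the models behind ChatGPT, through Hugging Face’s transformers library to continue a prompt. This is an illustration of the same family of techniques, not ChatGPT itself, which runs only as OpenAI’s hosted service.

```python
# A minimal sketch of natural language generation, assuming the
# `transformers` library is installed (pip install transformers).
# GPT-2 is a small, open predecessor of the models behind ChatGPT.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Ask the model to continue a prompt; it generates one token at a
# time, each conditioned on everything that came before it.
result = generator("For dinner I ate", max_new_tokens=15)
print(result[0]["generated_text"])
```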

PB: And in the context of ChatGPT, I feel like, for folks that don’t have a CS background, it seems like the emergence of this technology is kind of new and scary and out of the blue. How, in many ways, has this technology been a product of research that’s happened in the past? And could you situate us there in that context?

VP: Definitely. So ChatGPT has several stages of training, several stages of building, right? The very basic stage, and I still think it’s the most powerful stage, is called language modeling, which has been there for more than 20 years in the research community. And language modeling is a very intuitive task. We can have some interaction, right? The task is just to say, “For dinner, I ate blank,” right? What should we fill in that blank?

PB: Any food. Like spaghetti?

VP: Okay, then if we make a little bit of a contrast: “For lunch, or for breakfast, I had …”

PB: Oatmeal.

VP: So in other words, although you can have any type of food in both contexts, there is also this cultural background. For breakfast you have oatmeal, eggs, milk, right – and then for dinner it’s more likely to be spaghetti and then whatever, and then for lunch it’s probably more likely to be salad and sandwiches and all that, right? Then I give another context: “As a vegetarian, for dinner I had … Despite being a vegetarian, for dinner I had …” So language modeling just gives a machine this task. For all these tasks, I give you a context – and actually machines are required to do it at every step. So from the full sentence – for dinner, for lunch, for breakfast, “I ate …” – at every step the machine is tasked to predict what’s next. And then it’s very powerful because we have a lot of resources, which are the internet sources – all the internet recordings of humans doing question answering, or Wikipedia, or all these discussion forums. And then the machine is just reading all these things, trying to predict what will be the next word, given all the previous context. This task and this technique have been there for more than 20 years, but what is different now is we have much stronger computation – and then we also have better models, the basic machine learning models, to capture the nuances in the context. So then with all the power of computation and better modeling, models like ChatGPT and GPT-4 have been trained on all possible data that you can think of – the data dumps on the web, on Wikipedia – and they read all that, and then they’ve been trained to predict the language. And then from there, there are additional techniques to try to let this model follow instructions when I say “write a poem.” Because it has this basic understanding of language from reading all the internet, we still need to align the model with: “When I say a poem, what does it mean?” “When I say I need a summary, what does it mean?” “When I ask a question, what does it mean?” And then with a little bit of additional training on this instruction alignment, that’s pretty much what we see now as ChatGPT.
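
To make the prediction task Peng describes concrete, here is a toy count-based language model in Python. It only counts how often a word follows the previous few words in a handful of made-up sentences; models like GPT learn the same kind of conditional next-word probabilities, but with neural networks over vastly longer contexts and web-scale data.

```python
# A toy n-gram language model: given the last few words, estimate a
# probability distribution over the next word. This illustrates the
# language-modeling task itself, not how GPT is implemented.
from collections import Counter, defaultdict

corpus = [
    "for breakfast i ate oatmeal",
    "for breakfast i ate eggs",
    "for lunch i ate a sandwich",
    "for lunch i ate salad",
    "for dinner i ate spaghetti",
    "for dinner i ate pasta",
]

CONTEXT = 3  # how many previous words we condition on

# Count how often each word follows each length-3 context.
counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(CONTEXT, len(words)):
        context = tuple(words[i - CONTEXT:i])
        counts[context][words[i]] += 1

def predict_next(*context_words):
    """P(next word | previous CONTEXT words), estimated from counts."""
    c = counts[tuple(context_words)]
    total = sum(c.values())
    return {word: n / total for word, n in c.items()}

# The meal word changes the prediction, as in Peng's example.
print(predict_next("breakfast", "i", "ate"))  # {'oatmeal': 0.5, 'eggs': 0.5}
print(predict_next("dinner", "i", "ate"))     # {'spaghetti': 0.5, 'pasta': 0.5}
```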

PB: Is what distinguishes ChatGPT from a traditional search engine the focus on language, and also the fact that there’s a singular output? Or how do you view that?

VP: Yes, so for search engines, actually you need to do the heavy lifting, right? You give keywords. Search engines are mostly just keyword matching. It will first try to extract the keywords from your query, and then keyword matching gives you a lot of outputs, and then you try to choose what’s relevant. But this language model has sort of read everything – you can view it as memorizing the whole internet. When you give your query or question, it just thinks about what’s the best next word to spit out, given your question, and then that will be your answer. On the other hand, there are issues people have been discussing about that. It’s not always guaranteed that the results are accurate, because it’s memorizing and then spitting out. It’s not the exact thing that you can find on the internet by matching.
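
For contrast, here is a minimal, hypothetical sketch of the keyword-matching step Peng attributes to search engines: score each document by how many query words it contains and hand the ranked list back to the user. Real engines layer on ranking signals such as TF-IDF and link analysis, but the basic shape is this – and note that it returns existing documents rather than generating new text.

```python
# A crude stand-in for a search engine's retrieval step: rank
# documents by how many of the query's words they contain.
documents = {
    "doc1": "ucla announced a policy on chatgpt in march 2023",
    "doc2": "language models predict the next word from context",
    "doc3": "spaghetti is a common dinner food",
}

def keyword_search(query, docs):
    query_words = set(query.lower().split())
    scores = {
        name: len(query_words & set(text.split()))
        for name, text in docs.items()
    }
    # Return documents sorted by keyword overlap; the user does the
    # remaining work of deciding which results are actually relevant.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(keyword_search("chatgpt policy ucla", documents))
# [('doc1', 3), ('doc2', 0), ('doc3', 0)]
```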

PB: And in the context of higher education, how have you seen and how do you predict students will use this technology in their classes?

VP: Yeah, for me personally. … First, for my class we don’t have a lot of essay writing… So I felt it’s less, probably less useful – although, I know they are also good at writing code, but then our code is also more advanced … I feel it’s less likely that they can use it. On the other hand, we also don’t have adequate mechanisms to really detect that, so I cannot say … or my impression, or my understanding is in my class I haven’t seen too much of that… But speaking of that, I felt like personally, because I’m a second language speaker, ESL … I sometimes use ChatGPT to edit or polish my emails, for example … to make it more smooth. And I do find it’s actually quite useful in those scenarios.

PB: Yeah, I agree. In light of UCLA’s recent announcement, do you have some sort of opinion about the way that the administration is dealing with this or maybe something in the future that needs to happen? Something they should consider?

VP: Yeah, right. I mean, this is a complex matter, right? I do see it will be problematic for people to really use this. In the worst case, the consequences can be pretty severe, in the sense that students no longer put the hard work in, and they also don’t even check the output, and then it can be all erroneous … and they just use this tool and submit whatever they are tasked with doing – because the homework or the writing is actually a learning process, right? What we care about is not really only the end product, but also that in order to get that end product, the student needs to do all the work … reading and all that to be able to complete some task. … But on the other hand, I also think, for example, in the scenario that I’m mentioning, for ESL people to be able to write more fluently, and more like a native speaker – in the sense that the content is still yours, it’s really about the language, right? I feel it can actually be a good usage, and it’s like an advanced Grammarly type of thing. Yeah. So in that sense, I actually think we should allow the usage. So again, it’s a little complex, and my hope is more that there can be more detailed guidelines. … We shouldn’t say we will allow people to use that freely, right? Like, however you want. But then probably also not completely ban it, but give a good guideline about in what scenarios we think it’s helpful, or we think it’s totally fine to use, and in what scenarios you shouldn’t use it, and also give a little bit of reasoning about why, so that students also establish an understanding of why, and then hopefully they will be on board and they will also understand. Also have the generalizability of this: “Another thing maybe isn’t exactly this, but it’s the same spirit, so it’s also not allowed” … that type of thing.

PB: And more broadly, how do you predict these models will continue to evolve, especially in the context of GPT-4, which is remarkably better than the previous version?

VP: I think it will continue to evolve. I even saw a rumor – but it’s purely a rumor – saying GPT-5 will come in July, which is very soon. … I think there are several things the community has been trying to improve in these models. One is the multimodality ability … because for now, it’s mostly just text, right? But we know language is a symbolic system – it’s actually used to describe our experience, and it’s grounded in this experience, in the visual cues and a lot of these other modalities … whereas the original model without all those visual cues is a little inadequate. Like GPT-4 – one of its features is that it starts to understand images, and I think they will move to understand videos as well … and then from video, understanding temporal relations between events, understanding more of these spatial relations and commonsense knowledge. And another thing is they’re trying to build models to predict the model’s performance – like before we build a new iteration of the model, based on what we already know, there is this learning curve. … Can we know, for the next iteration … either with more data or more parameters or whatever … what the performance will look like for certain tasks? … And then based on that, they are trying to improve, or to use that as guidance for them to build a stronger model. So I do see it will continue to evolve. Yeah, especially with the introduction of additional modalities that will help the models learn more knowledge that may not be in the written language.
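
That last idea – predicting a bigger model’s performance from a “learning curve” before training it – is typically done by fitting a scaling law to smaller training runs. Here is a minimal sketch with made-up numbers; real scaling-law studies fit curves of roughly this loss = a · N^(−b) form to many measured runs.

```python
# Fit a power-law scaling curve, loss = a * N^(-b), to hypothetical
# (parameter count, validation loss) pairs from small training runs,
# then extrapolate to a larger model. All numbers here are invented
# purely for illustration.
import numpy as np

params = np.array([1e7, 1e8, 1e9, 1e10])   # model sizes we "trained"
losses = np.array([4.2, 3.5, 2.9, 2.4])    # losses we "measured"

# A power law is linear in log-log space:
# log(loss) = log(a) - b * log(N), so fit a straight line.
slope, intercept = np.polyfit(np.log(params), np.log(losses), 1)
a, b = np.exp(intercept), -slope

# Predict the loss of a model 10x larger than any we trained.
predicted = a * (1e11) ** (-b)
print(f"fitted: loss ~ {a:.2f} * N^(-{b:.3f})")
print(f"predicted loss at 1e11 params: {predicted:.2f}")
```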

PB: That was Violet Peng. Next we have Professor Srinivasan. We start by discussing ChatGPT’s direct impact on higher education and end with his views on how UCLA can lead in responsible usage of artificial intelligence.

As a faculty member and an expert on technology’s intersection with the political, the economic, the social, I just kind of want to hear your initial impressions or thoughts about how ChatGPT and ChatGPT-4 are going to impact the education system, long term and short term?

Ramesh Srinivasan: Well, in a certain sense, ChatGPT is not actually that different from the ways lots of other so-called “big data technology platforms” operate, right? In the sense that what it does is it correlates a query you make to it to bodies of data that are being gathered. And so that’s the same thing in a sort of different format. It’s similar in its functioning to the way Google’s search engine functions, or the way social media feeds and algorithms function, except the question of what determines what is selected to be fed back to you could be based upon a variety of different design principles. You know, we know in the case of most social media feeds, it’s not merely correlating your data – or your query, in the GPT case – to what comes back to you … it’s also selecting what comes back to you based on what’s most likely to engage you or outrage you, or essentially grab your attention, right? So it’s not totally clear if GPT’s sort of guiding heuristics, or the parameters behind its algorithms, function in the same way – whether the content that’s coming back to us is intended to be attention-grabbing. But it’s worth noting that what comes back to us with almost any big data tech platform is based on patterns of data that are being gathered about us. So now tying this to education, and UCLA in particular, I think is a very big issue, because what this technology does is essentially allow you to author and create content about almost anything based on these patterns of data. We do have histories of chatbots, including ones involving Microsoft. This is not the first chatbot. It’s just the most complex one, with the most sophisticated ability to act human-like and to gather enough data to be, in a sort of ongoing way, continuously conversational. So it’s a huge issue, because essentially you can have ChatGPT do the kinds of things that we would want our students to do, in terms of being critical or reflective or analytical, but without them actually doing any of those things. So in essence, what it does is it mimics human behavior, but it accomplishes none of the things that we actually want out of education, which is being imaginative, which is being critical, which is being analytical, which is being creative. And this technology is just sort of going to mimic, generally speaking, mass cultural patterns around a given query. It’s a huge issue for us not because the technology is actually particularly spectacular – I actually don’t think the technology is necessarily that interesting – I think the way in which the technology is asserted to so-called disrupt, or I might use the word threaten, many traditional activities that are important to us as human beings, I think that’s the main problem. The main problem is how it’s going to affect workers, and how it’s going to affect what we do best in a top university like UCLA, which is encourage and get our students to be creative and reflective and to show leadership. But now, if you have a tool that can basically pose as you, you’re going to actually not gain any of those skills that actually are critical. And there’s a lot of evidence that a lot of robotic and automated systems, especially in the area of care, caregiving, caretaking, humans again and again reject – they don’t want automated robotic systems to do caretaking. So all those real attributes of what make us human, which is creativity and care, are being sacrificed by this technology.
So obviously, I’m very, very critical of taking this too far. That said, it does have functions it can provide. It might be an interesting tool for us to work with, as long as we actually have a sense of economic justice, or at least balance in our world – actually have an educational institution that’s actually educational. So education isn’t me dumping information onto you, it’s us together, creating and ideating and reflecting. It’s about the analytical and the creative sides of learning. Learning by doing, learning by trying, learning by making mistakes, this thing can allow people to not do any of those difficult yet critically important activities to being in a top university or just in education itself. It sells our students short, or any of our students who use this – you’re actually not getting what you could otherwise get out of your university experience. I don’t like judging people for using anything. But I think that, in a sense, you’re wasting your money and your time at UCLA, if you overly rely on this. I mean, maybe you can mail it in and get a degree before we’re able to sort of regulate this sufficiently, like even on a campus level. But you’re selling yourself short, you’re doing yourself a disservice, you’re kind of doing everybody a disservice if you consider this tool, which is a total black box, to become a sort of representation of you.

PB: Yeah, and I think that relates well to your humanistic approach to technology and this optimism you have that we can innovate and be creative. And I’m curious, in the context of the job market that’s going to shift and the education system, what new skills or practices need to be prioritized.

RS: The key for work of the future, and workers of the future – and more important than work are people, workers – is a sense of economic dignity in a world where technologies are threatening, making precarious, I would say, economic security. That’s not the fault of technology. It’s because those technologies are being introduced and driven and monetized by private corporations that in many cases are not even profitable, which is crazy. In some cases they’re worth hundreds of billions of dollars, like an Uber, and they’re not even profitable. What we have to do is arm our students – and we also ourselves as faculty have to be on top of this – to be able to work with the technology. Knowing how it functions, looking at it, deconstructing it, pulling it apart. So that’s why it’s absolutely critical that we arm our students to understand what we call the affordances of different technologies, right? So like, what are the values by which that technology is created? What are the political and economic models that drive that technology? What are the histories associated with that technology, or the organization behind it? What are the design principles and engineering principles and heuristics that are used in the design of that technology? Those are the kinds of conceptual and analytical skills that can allow our students, now and in the future, to actually develop new versions of products like GPT that might be even more supportive of human creativity and care. That’s possible. The issue, though, is that again and again and again, we’re not taught creative, social science-oriented skills – and I don’t blame anyone in any university, including our own, for that – but that’s also because now any discussion of technology design and engineering is no longer only about engineering or technology. Technology – digital, corporate, big data, algorithmic technology – is the central mediating force in people’s life chances around the world. It’s critical that our future education initiatives are hyper-interdisciplinary and systems-oriented. You’re learning about macroeconomic externalities while you’re learning about computing heuristics. You have to do both at the same time, but it’s very difficult in large universities. The bureaucracy and the inertia make it very difficult with old-school institutions like our own. And you know, you can see another version of it, in a much more dysfunctional form, with the United States Congress.

PB: From an administrative level, UCLA said yesterday that ChatGPT violates the Student Code of Conduct. I’m interested in your reaction to that, and maybe other mechanisms for the UCLA administration to adapt or, not police but regulate this technology.

RS: What I’m trying to do is have my students – I try to give them a range of different assignments where they really can’t, or wouldn’t want to, use GPT, so we can balance it out. I try to teach about what is problematic about not just this technology, but other technologies. Not because I’m like some skeptic or cynic, but I want us to be analytical about this all. I totally agree that policing something never works. But I think we need to figure out ways to convince students and the wider community that we shouldn’t just bow to this technology. Instead, what we could do is try to articulate the types of uses of this technology that could be very productive to a learning experience, which may not be that many, but it could be a few. So, for example, if I’m an anthropologist, and I want to play with different types of software code – I don’t know why, but say I want to do that, and I’m not trained in building software – I could try testing out different versions of software that GPT would create. And in that case, look at GPT as a text, like a cultural text. The problem, though, with doing that, and even with what I said earlier, is that they don’t publish their source code. They don’t publish much on their learning models. Note that the company is called OpenAI, but it’s not open in any way at all.

PB: In closing, I want to more broadly discuss your lens of optimism and democratizing technology, creating technology that serves the public interest. I think in the context of ChatGPT, there’s a lot of panic, fear and cynicism around the capabilities of this technology. I’m curious if you could give us advice or a framework for being more optimistic or promoting creativity in this realm, in this world.

RS: My optimism comes from being human, rather than being a machine. My optimism comes from my times and experiences working all around the world and seeing the incredibly creative things people can do to make technologies work for them, hack existing technologies, design technologies according to different types of value systems, and seeing technologies as little more than expressive of our creativity, but also our values as human beings, our diversities. Now, of course, it’s very easy to get into two types of branding that are very dominant today. And they both sell, just like the way social media algorithms function. Trauma branding is a big thing these days, which is sort of like, “It’s all awful, there’s nothing we can do, it’s all black boxes.” I agree with the critiques underlying it. I don’t agree with the conclusion, because the conclusion leaves us completely powerless, which is absurd. If I have the privileged position of being a university professor, a full professor at one of the top universities in the world, who is a scholar of these issues and a former AI developer, I shouldn’t just be peddling trauma branding. That to me is irresponsible of me, especially when I’m in a public university. So that’s point one. Point two, even more annoying and more awful, is when the so-called leading people who are discussing technology decide to turn it into the Terminator complex, or the Matrix complex. That leaves us even more powerless. It sells really well to talk about how this technology is going to become RoboCop or the Terminator, or whatever Black Mirror-type thing you choose. But it’s our choice to take it there or not, and there are many steps that would need to be taken before that could even occur. And the idea of it becoming some sort of superspecies – what it does is actually shield us from the real-life issues associated with this technology: the absence of transparency, the absence of public audit and input, the economics of this. All these technologies leverage an internet that US taxpayers paid for, which started here at Boelter Hall, right at UCLA. And you’ll notice I’m not trying to vilify any of the engineers or CEOs or any of the people in these companies. I’m not interested in that. I think it’s absurd and irresponsible for us to just sort of assume that we’re all screwed, excuse my language, and left in the dark and unable to do anything about it. I think that’s just totally absurd and irresponsible. I want us to double down on life. I want us to double down on what it means to be human. I want us to double down on what it means to be alive. So it’s like, alright, UCLA community, you’re part of the most well-known – or I think now number one – public university in the country, and we’re also now, I think, a top 10 university or around there in the world. That’s a brand, that’s a status. We have to leverage that to lead in this area. We can’t just be either catastrophizing or just being like, “Oh, don’t use it, it’s bad.” That’s not good enough. You know, here’s a huge opportunity for public institutions – what’s left of them, you know: libraries, universities, things that are not just about speculative capital and capital valuation – here’s an opportunity for us to no longer have to just bow to that and to be like, “Hey, here’s a whole other set of alternatives that we believe in. This is what technologies of all types could look like from that perspective.”

PB: I really enjoyed working on this episode, and two things stuck out to me. One, GPT is not necessarily new. It’s a product of language modeling science that has existed for more than 20 years. Two, as with all new technological innovations, you must approach it skeptically and rationally. You must be conscious of these tools without being scared of them. If innovation is inevitable, it’s just a question of how we adapt and use it responsibly in the future. Next episode, we’ll talk to UCLA community members with backgrounds in philosophy, English, design media arts and political science. You can listen to that episode and other Daily Bruin podcasts on Spotify, Apple Podcasts and SoundCloud, and the transcript for this show is available at dailybruin.com. Thanks, everyone. See you next time.

