LLVM Tutorial #17: The Φ (Phi) Function - Toby Ho

tobyho.com » LLVM Tutorial #17: The Φ (Phi) Function Published on Jun 20th 2020Duration: 24:55 • Watch on YouTube

In this series I walkthrough the LLVM "Kaleidoscope" Tutorial, where you follow step by step to create your first programming language frontend using LLVM as the backend. Last time we wrote the lexer and the parser for the if statement. This time, we talk more about SSA, and more specifically, about the Φ (Phi) function.

Transcript

The following transcript was automatically generated by an algorithm.

  • 00:00 : hey it's Toby welcome back to the LLVM
  • 00:02 : tutorial walkthrough we're in the middle
  • 00:06 : of chapter 5 we just built the parser
  • 00:10 : for for the if statement or I should say
  • 00:15 : the if expression we weren't able to
  • 00:17 : test the parser yet not suppose whether
  • 00:21 : we're supposed to do that but before
  • 00:23 : doing that I took a detour because the
  • 00:26 : tutorial has has taken one we're going
  • 00:30 : to follow it it's telling us that if we
  • 00:35 : write an if statement like this in the
  • 00:40 : Kaleidoscope language and we compiled it
  • 00:43 : into LLVM i our code that code should
  • 00:47 : look like this let's dig into what this
  • 00:51 : means for a bit and it tells us about
  • 00:55 : this nifty feature of the LLVM opt tool
  • 01:00 : opt stands for optimization I believe
  • 01:05 : everything is abbreviated so I think
  • 01:09 : people just naturally learn these
  • 01:12 : abbreviated names like opt and LLVM as
  • 01:16 : as I believe stands for the assembler so
  • 01:21 : let's let's follow this so it's saying
  • 01:23 : if you copy this code and and put it
  • 01:27 : into a file and call it tol gosh why
  • 01:32 : does it have to be T dot ll what does T
  • 01:35 : mean I'll call it something else I'll
  • 01:38 : call it if example ll I paste it into
  • 01:46 : this file and then I run LLVM - yes and
  • 01:54 : you get my terminal
  • 01:57 : okay oh l vm - yes do I have that yes I
  • 02:01 : do
  • 02:04 : hello p.m. - yes less than sign that
  • 02:10 : means tick the file and type its content
  • 02:15 : into the standard input of LLVM yes I
  • 02:20 : caught it if example dot at all and then
  • 02:24 : took the output of LLB mas I want to
  • 02:27 : just see what is the output of we're
  • 02:30 : attempting to print out a bit code file
  • 02:33 : this is in advisable as it may cause
  • 02:37 : display problems if you really want to
  • 02:40 : taste it I'll help you I'm bit code
  • 02:42 : first hand you can force output with the
  • 02:45 : F option okay let's do it man let's do
  • 02:49 : it
  • 02:49 : oh okay yeah it's kind of
  • 02:54 : incomprehensible agreed but we're not
  • 02:58 : actually going to try to display it
  • 03:00 : because I'm gonna pipe it to the opt
  • 03:04 : program huh and with the analyze flag
  • 03:11 : with the view CFG flag and then it's
  • 03:20 : printing an analysis oh wow
  • 03:25 : why did it open up sublime text it
  • 03:31 : created this digraph file which I think
  • 03:35 : digraph is a program I can download I
  • 03:39 : believe which then will take this file
  • 03:42 : and render that visual for us I'm a
  • 03:46 : little bit sad the saying of window will
  • 03:50 : pop up and I should see this but a
  • 03:52 : window did not pop up
  • 03:58 : did it generate a file
  • 04:08 : it did not I didn't degenerative foil it
  • 04:13 : just unclear where it put it oh it put
  • 04:16 : it inside this okay and it's called dot
  • 04:22 : file and I may need a program in order
  • 04:29 : to open a dot file how do I open a dot
  • 04:36 : file what program do I use to open a dot
  • 04:39 : file it's to graph this I need to graph
  • 04:47 : this
  • 04:48 : do I have graph of this
  • 04:54 : a graph this might be what I want
  • 05:01 : another way to get this is called F dot
  • 05:04 : view C F G or F dot view CFG only where
  • 05:08 : F is a function pointer it's by
  • 05:11 : inserting actual calls into the code and
  • 05:13 : recompiling or by calling these in the
  • 05:16 : debugger OPM has many nice features for
  • 05:19 : visualizing various graphs that's
  • 05:21 : actually really cool and that is an
  • 05:25 : aspect of compilers that I haven't
  • 05:30 : really dug into that much so getting
  • 05:32 : back to the generated code is fairly
  • 05:34 : simple I will try to get the graph
  • 05:40 : visualization working maybe offline and
  • 05:45 : then when I get it working then I'll
  • 05:48 : show it but let's keep going for now
  • 05:50 : getting back to the generator code it is
  • 05:53 : very simple to entry block evaluates the
  • 05:56 : conditional expression X in our case
  • 05:58 : here so that's the conditional
  • 06:03 : expression and it's evaluating this it's
  • 06:11 : comparing the double to zero because if
  • 06:15 : you if you use just a number as the
  • 06:18 : expression then in most languages zero
  • 06:22 : is false E and anything that's not zero
  • 06:25 : is truth e so that's why it's just
  • 06:28 : comparing that number using the F comp
  • 06:31 : instruction
  • 06:33 : it stands for floating-point compare to
  • 06:38 : compare I don't know why there's this
  • 06:42 : one here what that stands for but it's
  • 06:45 : comparing the value of x to zero and
  • 06:50 : then it's gonna branch be R stands for
  • 06:53 : branch it's gonna branch if the result
  • 06:57 : of the compare was one I believe it's
  • 07:01 : gonna branch to the den block if that
  • 07:05 : was true T and branch it to the else
  • 07:08 : block
  • 07:08 : was not truthy and now it's gonna code
  • 07:11 : to this branch or this branch and then
  • 07:13 : after both of them finished they joined
  • 07:16 : back to this if can't
  • 07:18 : if continued block and then we'll have
  • 07:22 : this five function let's read about the
  • 07:24 : five function the five function is a big
  • 07:27 : deal in LLVM and it's kind of weird
  • 07:32 : because in programming school people
  • 07:35 : don't say don't tell you if you write an
  • 07:38 : if statement there's a five function
  • 07:40 : no this is a construct that was
  • 07:45 : introduced by SSA we talked about SSA in
  • 07:50 : the previous video we'll talk about more
  • 07:56 : excuse me
  • 07:59 : so the Phi operation if you're not
  • 08:04 : familiar with SSA this is Stephanie who
  • 08:08 : go read about SSA and and in fact I have
  • 08:12 : read about this on the Wikipedia article
  • 08:16 : which is quite a good read about it
  • 08:22 : excuse me apologize but um basically the
  • 08:32 : premise of SSA again is that if you have
  • 08:36 : multiple assignment to the same variable
  • 08:39 : suggest this case that's actually not
  • 08:41 : allowed in SSA form so how do you deal
  • 08:47 : with any arbitrary program that well
  • 08:50 : you convert a program that would do
  • 08:52 : something like this to SSA form first
  • 08:57 : why do we convert it to SSA form in the
  • 09:00 : first place because when the program is
  • 09:03 : written in SSA form it allows for a
  • 09:07 : variety of compiler optimizations which
  • 09:10 : might otherwise be difficult or
  • 09:12 : impossible so that is why we convert it
  • 09:15 : to SSA form first when you're generating
  • 09:18 : bytecode for
  • 09:22 : for LLVM ir it is assumed actually
  • 09:27 : there's no you have no choice you have
  • 09:29 : to generate that byte code in SSA form
  • 09:32 : which is to say this code here is
  • 09:36 : already in SSA form it you have to
  • 09:40 : conform to the SSA form you have no
  • 09:42 : choice if you're going to be using LLVM
  • 09:45 : to do this and then and then from LLVM
  • 09:49 : perspective oh wow it's already in the
  • 09:52 : SSA form great it can do of the killer
  • 09:55 : compiler optimizations that it's really
  • 09:58 : good at doing so anyway so which begs
  • 10:03 : the question if you have a program like
  • 10:06 : your user of your program and you wrote
  • 10:08 : a program like this and they need to
  • 10:13 : compile it what do you do well you have
  • 10:16 : to convert this to SSA form first how do
  • 10:19 : you do that you basically rename for
  • 10:25 : every single assignment you have to the
  • 10:27 : same variable you rename it to like you
  • 10:33 : suffix the name with a number that's how
  • 10:36 : you rename it so we have instead of Y
  • 10:38 : and y we have y 1 and y 2 which are
  • 10:41 : different variables so it says you just
  • 10:45 : converted the program to this which is
  • 10:48 : well you can now see well in the third
  • 10:52 : line becomes X 1 equals y 2 and it
  • 10:57 : actually now has nothing to do with who
  • 10:59 : I want which is an interesting property
  • 11:01 : of this but anyway let's take a program
  • 11:03 : they have an if statement because it
  • 11:06 : gets more complicated when it has an if
  • 11:08 : statement right so if you have an if
  • 11:10 : statement that says oh if X is less than
  • 11:13 : 3
  • 11:14 : I mean let me actually write this out so
  • 11:19 : this I think the code of this program
  • 11:22 : would be something like Oh x equals y x
  • 11:26 : equals X minus 3
  • 11:29 : if X is less than 3 then y equals x
  • 11:35 : times 2 and W equals y else oh this is
  • 11:44 : not just rainbow color business ok else
  • 11:52 : this block which does y equals X minus 3
  • 11:56 : and then after that if the house we're
  • 12:01 : gonna end and then we're gonna do W
  • 12:06 : equals X minus y in Z equals x plus y so
  • 12:14 : this is what this program is helping you
  • 12:17 : visualize here now we're gonna rewrite
  • 12:24 : this program now actually this might be
  • 12:29 : easier if I wrote this in my text editor
  • 12:31 : so let me do that
  • 12:43 : and we close out supplying text okay so
  • 12:52 : this example maybe I actually called it
  • 12:59 : untitled one okay so let me write this
  • 13:03 : out in this text editor so this is x
  • 13:06 : equals five x equals X minus 1 is if X
  • 13:12 : is less than 3 then Y is equal to x
  • 13:18 : times 2 w is equal to Y else y is equal
  • 13:25 : to X minus 3 and let's just say and more
  • 13:31 : like Ruby and then W is equal to X minus
  • 13:34 : y and z is equal to X plus 5 okay let's
  • 13:40 : convert this guy to SSA form well in SSA
  • 13:44 : form first of all we're not supposed to
  • 13:48 : assign the same inside multiple times to
  • 13:52 : the to the same variable so the second
  • 13:56 : time this has to become a different
  • 13:58 : variable from the first one so SSA form
  • 14:03 : of this program is going to look like X
  • 14:09 : 1 equals 5 and I'm also referring to
  • 14:13 : this flow graph which they have the
  • 14:16 : variable names labeled as well so you
  • 14:19 : can refer to that one if that one helps
  • 14:22 : you understand it so the second
  • 14:27 : assignment to X we have to assign to X 2
  • 14:30 : instead of X 1 and then now if X less
  • 14:36 : than 3 this X is going to refer to X 2
  • 14:39 : because X 2 is the last X that was a
  • 14:43 : scientist so if X 2 is less than 3 then
  • 14:48 : again we're going to refer to X 2 and
  • 14:51 : then
  • 14:51 : now this is gonna be extreme I believe
  • 14:55 : wait oh I messed this up this should be
  • 15:00 : why here okay never mind that
  • 15:04 : so this would be why there's y1 and y2
  • 15:09 : though so y1 is gonna be x 2 x 2 and
  • 15:14 : then we have ww1 and we're gonna just
  • 15:18 : label all the variables whether or not
  • 15:20 : it actually needs it but I think yeah
  • 15:23 : most of them so W 1 is gonna be Y 1 here
  • 15:28 : and then else
  • 15:29 : why Y 2 this time because here we
  • 15:39 : already assigned to Y here we're
  • 15:41 : assigning to Y again so the second time
  • 15:43 : it has to be a different version of Y so
  • 15:45 : Y 2 and as for X we're gonna use X 2
  • 15:51 : that's the last X that was assigned to
  • 15:53 : you so X 2 minus 3 that's what they have
  • 15:57 : here as well I'm gonna end it and then
  • 16:00 : outside of the way I messed up here -
  • 16:04 : this should be W ok and then here it
  • 16:09 : should be W 2 here because we already
  • 16:13 : assigned to W W over here so the second
  • 16:17 : assignment to W has to go to a different
  • 16:19 : W so W 2 is equal to X 2 the last time
  • 16:27 : we assigned your X was X 2 minus y now
  • 16:33 : the question is which Y which why are we
  • 16:38 : are we using for this why when we say W
  • 16:43 : equals X minus y and that's the problem
  • 16:47 : because in this branch we assigned to Y
  • 16:51 : 1 but in the else branch we assigned to
  • 16:54 : Y 2 so it could actually be either Y 1
  • 16:57 : or Y 2 depending on which branch we took
  • 17:02 : for the if statement
  • 17:04 : and which is why this diagram says wide
  • 17:08 : question mark which why and also the
  • 17:12 : next one is Z where sine x + y - Z well
  • 17:17 : X there's only one possibility for X
  • 17:19 : because we didn't assign to X within the
  • 17:23 : branching with an either branch so X has
  • 17:26 : to be X - but why we don't know we don't
  • 17:30 : it could either be Y 1 or Y 2 so what
  • 17:37 : are we gonna do about it well the
  • 17:38 : solution in in the SS a parlons is a Phi
  • 17:47 : function we have to insert a Phi
  • 17:51 : function which will pick either Y 1 or Y
  • 17:56 : 2 and the way you it picks is based on
  • 18:00 : whether you took this branch or this
  • 18:02 : branch so Phi they use this Phi symbol I
  • 18:07 : wonder if I can find the 5 symbol I just
  • 18:11 : copy from here so Phi of Y 1 y 2 Phi is
  • 18:19 : gonna pick either Y 1 or Y 2 and we'll
  • 18:28 : just assume that fine knows whether you
  • 18:32 : took this branch versus this branch and
  • 18:35 : if you took this branch fire is gonna
  • 18:37 : pick Y 1 if you took this branch fire is
  • 18:40 : gonna pick Y 2 is it does say that in
  • 18:45 : once the compiler actually generates
  • 18:49 : code out of this form and produces
  • 18:54 : assembly code or machine code this fire
  • 18:58 : function is not actually implemented as
  • 19:01 : a conditional statement because that
  • 19:03 : would you know you would take a
  • 19:05 : performance hit if it had to do that
  • 19:07 : instead what it usually does is like
  • 19:11 : there's actually a there's actually like
  • 19:15 : a y3 variable Oh
  • 19:17 : actually assigns a y3 variable first and
  • 19:22 : then references the live 3 variable like
  • 19:27 : that so yeah it's sort of it sort of
  • 19:32 : does this more or less you can you can
  • 19:40 : imagine it it's it's doing it like that
  • 19:44 : in reality but um but when we write the
  • 19:49 : when we write the code in SSA form this
  • 19:52 : is what we should write and the five
  • 19:56 : function is something that LLVM has in
  • 20:00 : its syntax okay so and there it is so
  • 20:06 : Phi is actually an instruction in LLVM
  • 20:10 : ir that's the Phi so this instruction is
  • 20:14 : saying okay I want to I want to create
  • 20:19 : this variable called f10 and I want its
  • 20:22 : value to be either Caulton if if we came
  • 20:29 : from the done block which is here or I
  • 20:33 : wanted to be caught and one in the event
  • 20:38 : that we came from the else block okay
  • 20:42 : and LLVM will actually verify that in
  • 20:46 : fact it's possible for you to reach this
  • 20:49 : point from the else block which is
  • 20:51 : because this branching statement is the
  • 20:54 : one that causes the else block to go to
  • 20:57 : the if if can't block if can't standing
  • 21:02 : for if continued right so that's how
  • 21:06 : this stuff works
  • 21:07 : it seems a little weird and because this
  • 21:11 : Phi concept is sort of a construct that
  • 21:15 : is serve comes out of nowhere it's not I
  • 21:20 : mean it's not an instruction that your
  • 21:23 : CPU understands Phi is not an actual CPU
  • 21:27 : instruction is this construct
  • 21:30 : that is in the LLVM ir language and it's
  • 21:34 : it's part of the single the static
  • 21:37 : single assignment form language in a
  • 21:39 : sense and because our Mir is in SSA form
  • 21:45 : we have this Phi instruction that we
  • 21:49 : need to use in the in the event we are
  • 21:51 : doing something like this with if
  • 21:53 : statements so so yeah
  • 22:00 : [Music]
  • 22:11 : so let me read this paragraph so at this
  • 22:15 : point you're probably starting to think
  • 22:17 : oh no this means my simple and elegant
  • 22:20 : Fran and we'll have to start generating
  • 22:21 : SSA form in order to use LLVM
  • 22:25 : fortunately this is not the case and we
  • 22:28 : strongly advise not implementing an SSA
  • 22:31 : construction algorithm in your front-end
  • 22:34 : unless there is an amazingly good reason
  • 22:36 : to do so in practice two sorts of values
  • 22:40 : that float around in code written for
  • 22:42 : your average imperative program language
  • 22:45 : that might need fine notes there are two
  • 22:48 : two sorts of values there's code that
  • 22:53 : involves user variables so I guess
  • 23:00 : something like this I don't understand
  • 23:07 : what they mean by user variables exactly
  • 23:09 : but anyway I think that probably just
  • 23:12 : means some variable that you're gonna
  • 23:14 : assign to more than once like your loop
  • 23:17 : counter values that are implicit in the
  • 23:21 : structure of your ast such as the final
  • 23:25 : in this case
  • 23:33 : okay in Chapter seven of this tutorial
  • 23:36 : mutable variables who will talk about
  • 23:38 : number one in that for now just believe
  • 23:42 : me that you don't need SSA construction
  • 23:45 : to handle this case for case two you
  • 23:48 : have the choice of using the techniques
  • 23:51 : that we will describe for number one or
  • 23:55 : you can insert final directly if
  • 23:58 : convenient in this case it's really easy
  • 24:01 : to generate the final so so we choose to
  • 24:08 : do it directly in this case in which
  • 24:12 : case okay I guess in this case means in
  • 24:20 : the case of the code we're gonna
  • 24:23 : implement in this in this chapter
  • 24:27 : okay alright I hope that was a good
  • 24:32 : explanation of five to five function and
  • 24:37 : SS a and I don't know where the next
  • 24:43 : step is going yet but I'm gonna stop
  • 24:46 : this episode here and then in the next
  • 24:49 : one we'll start building the code
  • 24:51 : generator

Từ khóa » Phi Llvm