In this interview with Fabien Campagne we talk about his experience with MPS in academia and industry. We cover topics like
- Fabien’s MPS books,
- How MPS helps computational biologists analyze data faster and avoid incorrect conclusions,
- How much time it takes graduate students to become productive with an MPS-based tool,
- The main misconception that potential users have about MPS-based DSLs,
- The drawbacks of the current proliferation of YAML and JSON in the industry,
- and many others!
Highlights
I think all those tools that I had done before led me to understand that when you’re doing analysis, what really matters is having flexibility about what you need to do, and having a way to express the analysis task in a very convenient way, convenient for you as a data analyst. And that’s where I became interested in MPS (12’35’’).
Biologists were really responsive to that because (…) the tool gave them enough flexibility that they could ask their own questions of the data. And when you do data analysis, that’s really important. So that’s what we really worked hard to make easier for our target audience (22’45’’).
Biologists much prefer the MPS experience because they can’t mess up with syntax. There’s nothing that they can do to break the syntax of an MPS editor, because a lot of things are constant and fixed, and you can’t just by mistake add a space or a dot in the middle of something and then your program won’t compile or execute anymore (23’59’’).
I wrote the first book in order for me to have a reference, so I could give to people in the lab to help them understand how MPS was working. Because even though there was some documentation online, I still felt that it wasn’t at the right level (25’53’’).
I think MPS can really bridge that gap because it can allow you to have high level languages, and yet generate those low level descriptors making the software that has to be written to parse those things much simpler (48’12’’).
Transcript
Sergej: Welcome to Beyond Parsing. This podcast is dedicated to language engineering, creating custom languages and using them to develop domain specific software tools.
Federico: We are your hosts, Federico and Sergej, two language engineering consultants.
Sergej: Today we interview Fabien Campagne, the author of a well-known two-volume book about JetBrains MPS. Fabien has used MPS for many years at the bioinformatics lab he has led at Weill Cornell Medical College. We discuss Fabien’s experiences with MPS in the domain of biology research and also in industry.
Federico: Hi Fabien, thank you very much for joining us. It’s a pleasure to have you on the second episode of Beyond Parsing. So thank you. How are you today?
Fabien: Hello, I’m doing great. Thank you for the invite to talk in your podcast.
Federico: Good. And together with us there is Sergej, of course.
Sergej: Hello.
Fabien: Hello, Sergej.
Sergej: Hi Fabien. So I will ask the first question. I’m actually quite interested by your story because as I understand you are not a computer scientist. You were trained as a chemist and yet how does a chemist come to start building computer languages?
Fabien: Okay, so it’s quite a long story because I probably worked with computers before learning chemistry. I learned programming when I was probably 15 years old, and I was developing games with friends in high school. So this was way before I even went to college to learn about chemistry. When I did reach a point where I had to go to college and I started learning about chemistry and biology and so on, I still was very interested in computers. And I tried to find a discipline where I could actually use both chemistry and the computational aspects that I was interested in. So that’s why I ended up doing a PhD in computational chemistry, which is essentially using computers to simulate chemistry, and what happens in chemical reactions, as well as proteins and so on.
Fabien: And that’s how I got the thesis in computational chemistry. But that was about the time when bioinformatics was starting to become very interesting and so during my thesis, I actually started my informatics project, which was… If you’re not familiar with bioinformatics it’s a combination. It’s an application of computer science to biology. So there’s a lot of things going on in this field. There’s data management. There’s visualization. There’s statistics. There’s machine learning. There’s all sorts of disciplines that you need to understand and use together in order to study biology with computers. And that’s how I ended up doing a postdoc after my thesis in New York City at Mount Sinai, working in the field of bioinformatics.
Fabien: Later on, I was recruited at Weill Cornell as an assistant professor and I built my own lab, developing new methods for computational biology and bioinformatics to study proteins, their functions, to start to make discoveries about biology using computational tools. So my lab wasn’t doing any experimental work. But we were using computational tools and approaches to study biology. I relied on collaborators who were doing experiments in their own labs. And my lab was providing predictions and tools to further the study of biology. So that’s explaining very, very briefly my background and how I got to… To try to answer your question, how does a chemist end up knowing a bit about computational stuff?
Sergej: So when you say study biology, what do you mean? Can you give an example?
Fabien: Sure. So I published probably like 50 papers in bioinformatics. Some of them are quite well known and cited. So for instance, there’s a paper in collaboration with another lab, we discovered a new type of calcium channel in the brain. We were studying Alzheimer’s disease at the time. This channel happens to be very important not only in the brain, but also in taste buds. It’s transducing the sweet and umami taste through the taste buds. And that’s how actually you can sense glucose, for instance. So this is a key gene that’s involved in transduction of the signal, how you sense sweet taste. And how your brain realizes that you’re tasting something sweet.
Sergej: And how do you study something like this? What experiments do you do? And what kind of work?
Fabien: Quite a few experiments. So this is work that we published a while back, probably, I think in 2009. Our role as a computational lab was to actually identify this gene in a region of susceptibility for Alzheimer’s disease, because that’s what we were studying at the time. So we identified it by looking at the genomic sequence and looking for genes in that region that had particular properties that we could measure computationally in the gene and infer or predict from the genomic sequence. So this was prioritization of a gene in this particular region, where you can have hundreds of genes. We knew from other studies that this region contained a gene of interest, and we were trying to find that particular gene in that very large region.
Fabien: So that’s one example of the work. Later on, we did other things. There’s one application to kidney transplant I’m particularly proud of where we actually use sequencing data from both donor and recipients for transplant. And we calculate the score that indicates compatibility between the genome of the donor and the recipient. And we found that this score actually does much better than HLA, which is a historical score used to allocate transplants to recipients for kidney transplantation. We found that the score we calculated from tons of genomic information was actually very strong at predicting loss of function of the kidney in the recipients. I don’t want to go too much into the details because I think you probably are more interested in the MPS side of things. But just to give you an idea of the kind of work I’ve done.
Sergej: Of course, because it’s still interesting to understand the context and not just “we built this language for computational biology”. But for me personally, it’s interesting to know the actual domain a little bit.
Federico: And for me, it’s interesting, the fact that, I mean, me and Sergej, we mostly build languages and tools for others to use. Whereas, as I think I understand, you’re building tools, but you’re also then using them to do actual research. So you’re both the language engineer and the domain expert in this case.
Fabien: That’s right. Yes. That’s actually probably a very important distinction. I’ve always believed that you’re building better tools if you’re actually using those tools at the same time, because otherwise, it’s hard to know exactly what matters. So in the lab, for instance, when we designed languages in MPS, for doing analysis of sequencing data that we would generate in the lab with collaborators, we always had questions we were asking about the biology and the language design was making sure that those tasks we had to perform, analyzing data, were done as efficiently and productively as possible, because we understood what we needed to do as well as how can the language actually make that simpler and more efficient.
Federico: But it’s also true that it must be very difficult to find someone with your skill set because you’re pretty proficient in two very difficult domains. Software engineering, and in particular language engineering, and biology that I cannot begin to understand. So I think it’s not easy to find someone with such comprehensive skills.
Fabien: Well, I think there’s tons of people who are good at software engineering and bioinformatics, perhaps not language engineering but there’s… Bioinformatics is a lot about tool development. So there’s quite a few labs who have people who can do this kind of work. Perhaps not as… I’ve always wanted to go further in terms of rigor for software engineering, because I consider that if you have a bug in your software you may just be misleading yourself about what you find about biology. That’s a big problem, a big concern. And I think we’re seeing more and more of those studies coming out where some bug in software actually invalidates a conclusion of a paper that was published about biology or chemistry-
Sergej: Yes, I have seen a couple of these in the news recently.
Fabien: Yeah. And people are starting to realize that rigor about software development and processes actually matter quite a bit. And this is something fortunately, I understood quite early on. And everything I did in my lab was to make sure that whatever we did we had good tests, making sure that the software was doing what we wanted to do. And we were able to catch problems before we put the paper out. My nightmare was having somebody else find the problems that we could have detected earlier if we had done the work.
Sergej: So what kinds of tools have you built to help you with this?
Fabien: Oh, along the years, many things. So, for instance, during my thesis, I built a tool to visualize a class of receptors that is very important for the pharmaceutical industry. They’re called G protein-coupled receptors. And during my thesis, I was building a tool using OpenGL to visualize the structure of those receptors. And this was an X/Motif application at the time. So you can see that I’ve been in this for a while. Afterwards, I-
Sergej: Must have been like 20, 30 years old.
Fabien: Yeah. After that, I built a tool as a Java web app to actually visualize the structure of those receptors in a two-dimensional page. This is a tool that has been deployed for probably 15 years now. It’s probably still in use. And even though there are better, I mean, there are more modern alternatives that I can see papers coming out about. This was, for a long time, the application people used to do these kinds of diagrams about proteins. Then there were frameworks for working with data and managing the data, making sure that we can compress large volumes of data obtained from sequencing, use them for computation in an efficient way. So there’s all sorts of tools that I’ve developed along the years.
Fabien: To get the discussion back towards MPS. I think all those tools that I had done before led me to understand that when you’re doing analysis, what really matters is having flexibility about what you need to do, and having a way to express the analysis task in a very convenient way, convenient for you as a data analyst. And that’s where I became interested in MPS. I’m not going to say that it was easy to learn MPS. In fact, I’ll tell you this anecdote about how I first learned about it. I think I found it on the web one day, and this was probably 2004, or something like that. This was one of the beta versions of MPS at the very beginning of it being public. [crosstalk].
Fabien: There was a tutorial that I tried to follow. And actually, I couldn’t finish this tutorial, because the steps that were described didn’t really match the user interface that was in front of me. So the whole exercise was an exercise in frustration at that stage. But I saw that there was something new and different about this tool because I was familiar with quite a few things at the time, and this looked very different. And at least the explanation of the tutorial sounded appealing. But because it was so difficult at the time to actually follow those steps I… And I was just experimenting on the weekend and looking at this new tool and trying to figure out how to use it.
Fabien: I dropped the matter for several years. And it’s only when I got back, stumbled again upon MPS, that the documentation was a bit stronger, and I was able to do an actual project with it. So I think that’s an example of what matters when people stumble onto a new technology and new tool. I think if you’re close to the team that developed the tool, it’s probably a lot easier to get help if you get stuck with something. But tool developers, and that’s true also of the tools we have developed, often don’t anticipate the barrier that picking up this new tool will be to somebody else who’s not in the group, who does not have access to the people who know how to do certain things.
Sergej: Yes, it’s difficult to see the tool with the outside eyes, somehow. So what actually caught-
Fabien: It’s really hard to know, yeah.
Sergej: What made you actually think that there’s something promising in it? What made you come back to it a second time? Do you remember what that was?
Fabien: So I have developed… I was interested in coming back to it because I needed ways to generate certain kind of boilerplate that I noticed we were doing all the time. So we needed to do data analysis and the usual way you do that for a certain type of data we were collecting at the time, which is called RNA-Seq, which is just a technology that we used to study expression of genes. And this technology generates tons of data, I’m talking about probably hundreds of gigabytes of data, that you need to analyze. So there are tools to do efficient computation and just distill and extract a table, a matrix of counts, for each gene you’re going to know how many… You’re going to get a measure of how much this gene is expressed in a particular biological sample. And a typical project can include tens of samples.
Fabien: So you end up with this matrix that tells you about those 30,000 genes that are expressed across those hundreds of samples in this way. And now you have to understand something about biology using this information. And there’s a lot of things that you want to do as an analyst that are quite repetitive actually. And so you would end up writing this similar code in some platform for analysis, and in biology and bioinformatics R is quite a popular language. So in my lab, I had several students and postdocs writing the same scripts borrowing from each other, and writing very similar analysis, changing the names of the samples and loading different data sets.
Fabien: And that’s when I realized that if we had a system to make this a lot easier, and if we could just expose a certain subset of the degrees of freedom, as I think about that, in those particular projects, then we would make our life a lot easier. And it’s always, thinking about reproducibility and thinking about how can we avoid silly mistakes about, this variable name is used in this way but it’s overridden here and that changes your analysis results, and you didn’t notice because this is a silly mistake, and it wasn’t obvious. So if we could generate this code from a much simpler description that you can actually understand very quickly, that would make your life a lot easier.
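The idea of generating repetitive analysis code from a much simpler description can be illustrated with a minimal Python sketch. This is not MetaR or its MPS generator: the `Analysis` class, the file name `counts.tsv`, and the sample names are hypothetical, and the emitted R code only loads a count table and computes per-group means.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Analysis:
    """Hypothetical high-level description: only the parts that vary per project."""
    counts_file: str
    group_a: List[str]  # sample names in the first group
    group_b: List[str]  # sample names in the second group


def r_vector(names: List[str]) -> str:
    """Render a list of sample names as an R character vector literal."""
    return "c(" + ", ".join(f'"{n}"' for n in names) + ")"


def generate_r_script(a: Analysis) -> str:
    """Emit the repetitive R boilerplate from the short description."""
    lines = [
        f'counts <- read.delim("{a.counts_file}", row.names = 1)',
        f"group_a <- {r_vector(a.group_a)}",
        f"group_b <- {r_vector(a.group_b)}",
        "mean_a <- rowMeans(counts[, group_a, drop = FALSE])",
        "mean_b <- rowMeans(counts[, group_b, drop = FALSE])",
        "print(head(data.frame(mean_a, mean_b)))",
    ]
    return "\n".join(lines) + "\n"


# Made-up project description; only these three values change between analyses.
script = generate_r_script(
    Analysis("counts.tsv", ["ctrl_1", "ctrl_2"], ["treated_1", "treated_2"])
)
print(script)
```

Because the sample names appear once in the description and are copied into the script by the generator, the kind of silent variable mix-up described above has fewer places to hide.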
Fabien: So this is how I started this project. I started it with something very simple, just starting to generate one of those scripts from a high level MPS description. And that’s one application. I’m in fact skipping another project that we did before with MPS, which was more of a proof of concept of how can we actually use MPS to do certain things we wanted to do. This project was probably not as popular as the one I talk about for the data analysis, that was called MetaR, but this earlier project we published as a preprint and as a journal article afterwards. People found it interesting, but it was more for us to really start to understand the MPS technology and what it could do for us in terms of language composition and so on and so on, and all the various features that were available, and perhaps trying to explain it more to the field so that others could understand what could be the advantages of this technology.
Sergej: And what was this project?
Fabien: This was a NYoSh project. So we have a paper on this.
Sergej: I’ve seen it on your website. The “Not Your Ordinary Shell”. For some reason the first words that come to me, is New York shell, not the other explanation.
Fabien: Yeah, sorry about that. Poor naming, I guess. I’ve always been terrible at naming. But the other project, the one that was actually used a lot more is called MetaR. And this is the one where we have this high level language where you can actually construct data analysis in MPS and generate that code to R scripts that you can run on new data sets. And this was quite popular because we started using it at Weill Cornell, where my lab was located for 15 years. We started using it and giving training sessions roughly every two weeks. And we had a lot of graduate students, technicians, and even lab heads, who took the training and started using the platform to analyze their own data.
Fabien: Because the thing to know is that when experimentalists generate RNA-Seq data, which is this technology to measure gene expression, this more modern technology, they often end up with files that they can’t really analyze. They end up with a disk and that disk contains megabytes and megabytes of compressed sequence file, and they don’t really know what to do. So what we’re doing with MetaR is provide them an option that they can actually use to do analysis on their own. So we were trying to take ourselves out of the picture by providing a tool that was flexible enough that they could ask the questions they wanted to ask.
Sergej: And how long did it usually take to train them to use this tool based on MPS?
Fabien: So we got to the point where we spent about half an hour to install the software, MPS and the plugin on their laptop, because we wanted people to be completely independent with their own laptop, to be able to do this analysis. And then another hour to go through the analysis with MPS. Which was a big improvement compared to people having to learn the R language on their own. If you’re a biologist and you don’t know programming picking up R is a huge endeavor. May take you a week of training just to reach the point where you can do something useful.
Sergej: And with MetaR they didn’t have to know R to start.
Fabien: That’s right. With MetaR they just had to learn the language we were teaching them in this session. And in one hour, we took them… We had the standard data set that was used for those training sessions. In one hour, we were able to take them from loading the data set, looking at the columns and the rows in that data set, annotating them to know what the biological samples were. And that’s important when you do analysis, you need to annotate to indicate what’s contained in those data points. And after that they were able to build a heat map just to visualize the gene expression, run differential expression analysis, which essentially compares across groups of samples to find genes that are differently expressed.
Fabien: So find differences in your data set and then plot some visualization of that, which is a heat map. So all of that in one hour. And biologists were really responsive to that because essentially, the tool gave them the right level at which to interrogate the biological data. It gave them a way… It gave them enough flexibility that they could ask their own questions of the data. And when you do data analysis, that’s really important. So that’s what we really worked hard to make easier for our target audience.
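To make "comparing across groups of samples" concrete, here is a minimal Python sketch that computes a log2 fold change per gene between two annotated groups. It is a deliberate simplification: real differential expression analysis also involves normalization and statistical testing, and nothing here reflects how MetaR implements it. The gene names, sample names, and counts are made up.

```python
import math

# Toy count matrix (made-up numbers): gene -> counts per sample.
counts = {
    "geneA": {"ctrl_1": 100, "ctrl_2": 120, "treated_1": 410, "treated_2": 470},
    "geneB": {"ctrl_1": 50, "ctrl_2": 54, "treated_1": 49, "treated_2": 55},
}

# Sample annotation: which group each sample belongs to.
groups = {
    "ctrl_1": "control",
    "ctrl_2": "control",
    "treated_1": "treated",
    "treated_2": "treated",
}


def group_mean(gene_counts, group):
    """Average count of one gene over the samples annotated with `group`."""
    values = [c for sample, c in gene_counts.items() if groups[sample] == group]
    return sum(values) / len(values)


def log2_fold_change(gene_counts, pseudocount=1.0):
    """log2 ratio of treated vs. control means; the pseudocount avoids division by zero."""
    treated = group_mean(gene_counts, "treated") + pseudocount
    control = group_mean(gene_counts, "control") + pseudocount
    return math.log2(treated / control)


for gene, gene_counts in counts.items():
    print(gene, round(log2_fold_change(gene_counts), 2))
```

In this toy data, geneA is roughly four times higher in the treated group while geneB is unchanged, which is the kind of difference a heat map would then make visible.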
Sergej: Sounds like you gave them just the right language, and it was a huge success.
Fabien: Yeah, we probably trained a couple hundred people in using the system. And the feedback was quite positive.
Sergej: What was their experience or their feedback about using MPS itself as a tool? You know, the projectional editing-
Fabien: Well, the thing about biologists is they don’t have the usual software engineer experience. They have never used text-based programming. So for them, it’s just another tool. There’s nothing weird about it except that it’s just another tool to learn.
Sergej: They have nothing text-based to compare it to, basically.
Fabien: That’s right. So for them, in fact I would… I’ve taught the two kinds of languages, the text-based languages and the MPS-based languages with projectional editing. And in fact, biologists much prefer the MPS experience because they can’t mess up with syntax. There’s nothing that they can do to break the syntax of an MPS editor, because a lot of things are constant and fixed, and you can’t just by mistake add a space or a dot in the middle of something and then your program won’t compile or execute anymore.
Sergej: And I guess they didn’t have to use the generator or, well, actually probably MetaR generated into R, right?
Fabien: That’s right. Yes. So they were not concerned with language design at all. For them it was just purely an experience of using a tool to do something they wanted to do.
Sergej: What about version control?
Fabien: We didn’t cover that because most people don’t know about that, or really, in the first instance need that. In the lab we used version control, for our own analysis. And we used it in MPS, in the MetaR analysis, and it’s an extremely important tool. But in the first session we were giving we didn’t cover that, because first you need the people to start using the approach. And then we had some advanced sessions where we actually told them about that. But I don’t think that too many people were picking up on these approaches.
Sergej: But is this also the time, I’m sorry Federico. I will let you ask your question next. But is this also the time when you started to write the books about MPS?
Fabien: Yes, that’s approximately… So while I was working on MetaR, which was a project that lasted a couple of years, I actually started writing the first book pretty much on the weekends, because that’s when you have more time, at least as an academic, to do things like writing books. I spent a lot of time writing articles and writing grant applications as an academic. So essentially, I wrote the first book in order for me to have a reference, so I could give to people in the lab to help them understand how MPS was working. Because even though there was some documentation online, I still felt that it wasn’t at the right level. It wasn’t helping me… I had to do a lot of work to figure out all the things that I put into the book, and I wanted to have a record of those things I had figured out and I wanted to make it easier for others to understand.
Fabien: Because my impression, having worked with MPS for a while now is that the tool is one thing but unless you document it very well it’s very hard for newcomers to get into the technology. And this is a message I’ve been trying to give to the community and to people at JetBrains who are building MPS. I’ve tried to explain that documentation is essential. So the book, the first volume and the second one were my attempt at filling a gap. I felt that there was a gap, that if you are a newcomer to MPS you just didn’t have enough to get you started in the smooth and easy way to the point where you could do things on your own and figure out the rest.
Sergej: From my point of view, you actually filled that gap perfectly because I myself have used your book to learn MPS at first and I found it very, very helpful. I appreciate your writing it… And publishing.
Fabien: I’m glad it was helpful.
Federico: Yeah, they were also very useful for me. And I was wondering if you get any feedback and if you were able to understand how they were received, because I think, in my experience, many times someone publishes something, there are people out there using it, but you don’t hear back from them, especially if it works fine. So did you hear any opinion from your readers?
Fabien: I got a little bit of feedback at the beginning when the first versions of the books came out. I think most of the feedback was positive. There were a few people who emailed me because they found some typos or something that wasn’t clear. So I addressed that in revisions. But so far, I think people were quite happy that there was such a resource. I receive questions sometimes about, are you going to revise it for the latest version of MPS? So I guess-
Sergej: That would have been my question as well.
Fabien: So I guess that shows that people find it useful. And so my answer to that is that I would like to revise it for a new version of MPS. But revising a book is quite an important effort and because I would do that only on the weekends, and I usually write mostly in the winter when the weather is not that nice in New York, I wait for major changes to MPS before actually revising the book, because my impression is that there’s probably enough in the current version that you can pick up enough about MPS and then learn the new things on your own. But if something like, I’ve heard recently that there’s a new type system coming, that would be worth an update to the book, because that would be very different from the current type system chapter in the book.
Sergej: Yes, that’s true.
Federico: One thing that I keep hearing, and I would like to hear your opinion on that, is that somehow modeling in general and DSLs also tend to be more popular in Europe, while you are promoting them in the US but on the other hand you’re European. So I was wondering if you had any comment on that, if you think that they have the same level of popularity or there is some difference between Americans and Europeans when it comes to-
Sergej: Or maybe we just don’t hear about the US or American efforts here in Europe.
Federico: Could be true also.
Fabien: I think it is very related to the point I mentioned that you have to be close to the developers of the technology to really be able to pick it up quickly. I think I would agree that I see a lot of activity happening in Europe, I see a bit less in the States. But I think the reason is because it’s not that easy to get in touch or to sit together with somebody who’s developing those approaches, and get help when you get stuck and things like that. So I managed to learn this technology on my own and then I was able to teach others in my lab and to teach colleagues. And I’m continuing to do this now in my industrial job. I teach colleagues in the company, and perhaps we can touch on those new projects in a moment.
Fabien: But there, the same thing happens, people if they try… First of all, if you try to explain what you do with MPS with just snapshots of what the screen looks like, I think a lot of people just are scared with what they see, because if you think about what they see, they see a lot of text on that page that, if you’re only used to textual editors, you think you have to type.
Sergej: Mm-hmm (affirmative). Right.
Fabien: And this is a misconception that I can’t really easily correct by writing about… Even if I tell people, “Oh, by the way, everything you see on that snapshot you don’t have to type yourself.” It’s not going to become clear to people until they sit for a demo.
Sergej: Yeah.
Federico: I mean, in a way, it’s like filling forms, and languages give also, structure to you, right? That maybe-
Fabien: That’s right. And we understand that because we have sat through a demo of MPS. And now we do things in MPS. And as soon as I do that demo to colleagues, then they’re like, “Oh, yeah.” As soon as they’ve tried it themselves. As soon as they went through a tutorial they understand, and that barrier, this fear of, “Oh, I have to write all those things that I see as a snapshot,” is gone. But you need to get to that level. And the only way I have found that is effective is actually giving a tutorial and engaging people and making them try themselves.
Federico: Okay. So in a way, the language guides you, it’s like if it was asking you questions instead of you trying to formalize a whole entire system, like you would do using a more typical programming language where you start with a blank file and you have to think how you organize your thoughts.
Fabien: I totally agree with you. I’m just saying that if you’re learning in isolation, and for instance, somebody just downloads MPS, and they try to understand how it works, it will be very hard unless they have guidance and it would be so much easier if they have somebody showing them the first time how to use it. And I think that explains why it’s more popular in Europe, because you have a network of people who meet in person and can show one another. In the US, the distance makes it harder to do that.
Sergej: Right. That’s an interesting observation. Actually I haven’t thought about it like that. So you mentioned, you are in an industry job now and that you are continuing to teach MPS to your colleagues. Can you touch on that a bit?
Fabien: Right. So I joined a company that is selling advertisements, I mean web-based advertisements. So it’s a platform for ads. And what I’m doing here is I’m actually building AI and deep learning models for detecting fraud, detecting whether advertisements are political in nature. And we need to collect information from the people who actually submitted those ads to the platform, and other kind of information that we need to predict for the platform. So that’s my main job. But as part of this job I’m building infrastructures to actually make those models, to deploy them in production. The way we deploy models in production here in the team where I work, is we actually build Kubernetes deployments.
Fabien: So if you’re not familiar with Kubernetes, if you know Docker, for instance, Kubernetes is the system that allows you to deploy containers and orchestrate the deployments. So you can construct entire web applications, databases, back ends and so on. And Kubernetes helps you set up this configuration so that you can just say, “Okay, deploy this app!” And all the pieces and the parts, all the components of this app will be configured automatically for you and deployed and monitored, and you will know that the things are running.
Fabien: So Kubernetes was open-sourced by Google several years ago. I remember a meeting I attended where there was a presentation about Kubernetes. And it’s been very helpful, a lot of large companies are using it. It’s probably not something your startup wants to use because there’s quite a bit of overhead in learning Kubernetes and setting up a cluster and so on. But it’s extremely useful for production systems, robustness and making sure you can scale things horizontally and add more replicas of the application as demand increases and so on. The problem is, when I joined this company I had to learn Kubernetes, I had to figure out how to use it. And what I found is that everything you do has to be written in YAML.
Fabien: So there’s a lot of very boilerplate information that has to be written about those deployments. And most of them look very similar. There’s a lot of copy pasting going around. People borrow a YAML file and then they start making changes and so on. And then I realized that MPS would be a perfect platform for doing configuration of Kubernetes apps. Because if we can generate the YAML, we can have high level languages that really distill what you need to enter to configure a new application. So I started building systems to do that. I did this on my own initially. And when I had a system that was good enough to actually do support deployments to Kubernetes then I started showing it to other colleagues.
Fabien: And now more people are using it because it’s simpler. It’s much simpler to actually create deployments for Kubernetes using the high level languages I built in MPS than using raw YAML because those files can be quite huge. And with YAML and indentation, when you have several levels of indentation, things can get really messy. So that’s how I’m using MPS at the moment. I’m using it for generating YAML. I use this for configuring Kubernetes but also for building applications. I generate composed build scripts, for instance, from a very simple description that describes what application needs to be built, what kind of Helm Charts need to be built.
Fabien: Helm Charts are components that go into Kubernetes deployments, and help with the lifecycle of an application. So all of that, I built a system in MPS where I can configure all those artifacts and then generate the project, and pretty much I can run a few command lines and then the app is deployed to a development system and then it can be moved to a production system.
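The idea Fabien describes, a short high level description that expands into a verbose Kubernetes descriptor, can be sketched outside MPS as well. The Python below is only an illustration and not Fabien’s MPS languages: `AppSpec`, `to_deployment` and `render`, as well as the example image name, are hypothetical. It prints JSON rather than YAML to stay dependency-free, on the assumption that the target tooling accepts JSON manifests (Kubernetes does, and YAML 1.2 is designed as a superset of JSON).

```python
import json
from dataclasses import dataclass


@dataclass
class AppSpec:
    """Hypothetical high level description: only what the user must decide."""
    name: str
    image: str
    replicas: int = 1
    port: int = 8080


def to_deployment(app: AppSpec) -> dict:
    """Expand the short description into a full Deployment manifest."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": app.name, "labels": {"app": app.name}},
        "spec": {
            "replicas": app.replicas,
            # The selector must match the pod template labels; generating
            # both from the same field keeps them consistent.
            "selector": {"matchLabels": {"app": app.name}},
            "template": {
                "metadata": {"labels": {"app": app.name}},
                "spec": {
                    "containers": [
                        {
                            "name": app.name,
                            "image": app.image,
                            "ports": [{"containerPort": app.port}],
                        }
                    ]
                },
            },
        },
    }


def render(app: AppSpec) -> str:
    """Serialize the manifest; the nesting is produced by the serializer."""
    return json.dumps(to_deployment(app), indent=2)


if __name__ == "__main__":
    print(render(AppSpec(name="ad-classifier",
                         image="example.org/ad-classifier:1.0",
                         replicas=3)))
```

Four fields on the input side become roughly thirty lines of descriptor, and the repeated label can no longer drift out of sync through copy and paste.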
Sergej: Out of curiosity did you use MPS just because you are familiar with it or did you consider other tools as well?
Fabien: I definitely used MPS because I was familiar with it, because for me now, after writing those books and working on multiple projects, it’s very simple to start a new project with MPS. I mean, considering other tools is something I would do if I saw another great tool coming out, but then it would probably take me six months to get to the point where I can do similar things with it as I can do with MPS.
Sergej: That’s right. So now you are showing it to developers or your DevOps colleagues, right?
Fabien: So mostly it’s in the team. I’m sharing it with other software engineers in the team. DevOps, unless I sit down with them and give them a tutorial, it’s the same problem: if I showed just snapshots they don’t quite get it. So I think I will need at some point… I’m going to give a talk next month about the approaches. And then I’ll continue doing tutorials for people who are interested in starting to use the system. And then we’ll go with organic growth.
Federico: I think this use case is quite interesting, because we got the impression that in many cases the ideal users for DSLs and tools built with MPS are people who are not developers, people who do not have an alternative, while the tools that you provide are targeted at people who could just write the configuration file. So I think you need to build a tool that is much better because they have alternatives. So unless this works very well, they could resist adopting it maybe.
Fabien: That’s right. It’s absolutely true. And that’s why I waited before showing that tool to colleagues until I had something that worked well enough that I could be confident that they would be able to generate something that works.
Sergej: What was their reaction?
Fabien: Sorry. Go ahead.
Sergej: What was their reaction? Are they skeptical or positive?
Fabien: So I think it’s the same reaction, before I go through a tutorial, they’re scared of the snapshots, they see the snapshots, they see everything that is used on those screens and they fear that they will have to enter all of that, and remember that syntax. When I show them that the syntax is actually given and they don’t have to think about it and they just have to fill in the slots that are available and can be edited, then things change, then they realize how much flexibility there is and that they always generate correct YAML, they never get an indentation error. This is a big deal because the feedback I got from others who have built those deployments in Kubernetes is that as soon as you start getting to the eighth level of indentation in YAML things start to be a bit tricky.
Fabien: So the first thing I did, and that took me about two days in a hackathon, is to actually model YAML and JSON in a language so that I could have editors, I could copy paste YAML into MPS. And I could just generate that directly. And this takes out all the syntax aspect, and now you can focus on higher level languages that really express the semantics of what you’re trying to do. And then the downstream aspects are guaranteed to work because this… I haven’t changed the generation of YAML in a while now because it’s stable.
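To illustrate why modeling YAML as a structure removes indentation errors, here is a minimal Python sketch. It is not the MPS language Fabien built: the node classes `Scalar`, `Mapping` and `Sequence` and the `emit` function are hypothetical names, and the sketch assumes mapping keys are simple identifiers and that strings can be written as double-quoted scalars. Indentation is computed from the depth in the tree, so nobody types it by hand.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union


@dataclass
class Scalar:
    value: Union[str, int, float, bool, None]


@dataclass
class Mapping:
    entries: List[Tuple[str, "Node"]] = field(default_factory=list)


@dataclass
class Sequence:
    items: List["Node"] = field(default_factory=list)


Node = Union[Scalar, Mapping, Sequence]


def inline(node: Node) -> Optional[str]:
    """One-line form for scalars and empty collections, otherwise None."""
    if isinstance(node, Scalar):
        if node.value is None:
            return "null"
        if isinstance(node.value, bool):
            return "true" if node.value else "false"
        if isinstance(node.value, (int, float)):
            return str(node.value)
        return json.dumps(node.value)  # always a double-quoted string
    if isinstance(node, Mapping) and not node.entries:
        return "{}"
    if isinstance(node, Sequence) and not node.items:
        return "[]"
    return None


def emit(node: Node, depth: int = 0) -> List[str]:
    """Render a node as YAML lines, indenting two spaces per nesting level."""
    pad = "  " * depth
    short = inline(node)
    if short is not None:
        return [pad + short]
    lines: List[str] = []
    if isinstance(node, Mapping):
        for key, value in node.entries:
            value_short = inline(value)
            if value_short is not None:
                lines.append(f"{pad}{key}: {value_short}")
            else:
                lines.append(f"{pad}{key}:")
                lines.extend(emit(value, depth + 1))
    else:  # Sequence
        for item in node.items:
            child = emit(item, depth + 1)
            # Put the "- " marker into the first child line's indentation.
            child[0] = pad + "- " + child[0][len(pad) + 2:]
            lines.extend(child)
    return lines


if __name__ == "__main__":
    doc = Mapping([
        ("kind", Scalar("Deployment")),
        ("spec", Mapping([
            ("replicas", Scalar(3)),
            ("containers", Sequence([
                Mapping([("name", Scalar("web")),
                         ("ports", Sequence([Scalar(80)]))]),
            ])),
        ])),
    ])
    print("\n".join(emit(doc)))
```

A higher level language then only needs to build such a tree; the emitter stays unchanged, which matches Fabien’s remark that the YAML generation has been stable for a while.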
Federico: Of course. Following up on what Sergej was asking you before, the fact that you are invested in MPS in a way, because you spent all the time learning it. So I was wondering if there is any problem that you see in MPS? Any feature that you think you will need to get if you want to keep using MPS? Or are you perfectly satisfied with how it works?
Fabien: Well, I’m never satisfied with any tool. I always want more of the tools I’m using, I always think that something could be improved. That’s probably just my character, but that’s how I think. One thing that I think would be very helpful, and I’ve seen steps being taken about that, is there’s a ton of old bugs that need fixing in MPS. And that’s been my experience with many, many versions of MPS. And I think JetBrains is finally starting to move in that direction.
Sergej: Right. The current release is supposed to be focused on bug fixing.
Fabien: That’s right. I think this was needed. At some point you need to stabilize the platform and address all those little annoyances that people have reported that have not yet been fixed. So I think that’s a good one. That’s an important one. The other thing that everybody’s talking about, and I would really love to see, is a web-based way of using those languages that we’re building. I think this would open up a lot of usage for the platform. But I realize it’s a hard one. And this is something that I’m not sure how this would happen or work.
Sergej: Apparently, just bringing MPS to the web is not the problem, doing the concurrent editing thing seems to be the biggest complication.
Fabien: So anyway, it’s definitely something that I think would be extremely useful to the community and would allow other people to experiment with the platform and actually see for themselves, because I feel that the current MPS is very intimidating if you’re completely new to it and you’re used to other tools. And even though it shares a lot with the IDEA platform, and a lot of software engineers are used to this platform, it still feels very foreign when you use it for the first time.
Sergej: Right.
Federico: Yeah, and I think that sometimes how it feels contributes to giving you the impression that MPS is complicated, and not just the editors but the whole IDE around the editor. It seems stupid, but you see a gazillion menus, you see complicated project views and you have the perception that this is a very technical tool, while the languages maybe are pretty simple. But still the infrastructure around them gives you this impression.
Fabien: Yeah, it’s true. I agree with that. I think it could be useful to have a way to just publish those models in a way that people could use without all the distraction. This is something I experienced when I tried to teach MPS to biologists, they essentially had to learn to ignore a lot of things in this user interface. Because what they needed to do did not require all those menus and things like that. Which is okay. I mean, people can learn to ignore a lot of things. But generally, it makes for a better user experience if you’re just presented with the things that actually matter to you.
Federico: It becomes a little bit like black magic. It works, but you don’t know why, and there are a lot of things that you don’t touch because you’re not sure what will happen. It reduces the trust, I think.
Sergej: Okay, I think our time is almost up. It’s probably time to wrap up this episode. So Fabien, do you have anything else to add? Anything we didn’t ask you about?
Fabien: I think, just one brief thing, which is a comment on something I’m seeing in the field of software engineering. I see a lot of focus nowadays on very simple languages, like JSON and YAML, for configuring complex systems such as Concourse builds, Kubernetes deployments, Spinnaker for deploying Kubernetes apps to multiple clusters. And even though I understand why the app developers and software engineers go for very simple languages, because they don’t have to worry about parsing, I think that puts a lot of work on the human, on the engineers, to actually craft those descriptors, those specifications, in those very simple languages that don’t really give you a lot of support for editing.
Fabien: So I think MPS can really bridge that gap because it can allow you to have high level languages, and yet generate those low level descriptors making the software that has to be written to parse those things much simpler. So that’s just my view of where the industry is heading. And even though I understand why people go there, I think they forget that by simplifying their software they’re making the life of software engineers harder, because now you have all this cognitive load: you need to be cognizant of the syntax of this very simple language that you have to twist to your needs. And having a much better user interface and intentions and things like that can really make a world of difference when you need to configure complex systems.
Sergej: Yeah. So thank you Fabien. Thank you for coming. And thank you for your observations. I found some of them were a little bit unexpected. And I must say, pleasantly unexpected. So thank you for coming, for having us think about those things.
Federico: Yeah. Thank you.
Fabien: Yeah. You’re welcome. Thank you.
Sergej: Thank you for listening to this episode of Beyond Parsing. We are your hosts Federico and Sergej, two language engineering consultants.
Federico: If you want to learn more about domain specific languages and language engineering visit BeyondParsing.com.