
λ - programming


File: 1442750574870.png (1.3 MB, 1200x1800, 1433792574974.png)

 No.9620

I didn't see a thread on this already, so I thought I would start one on programming language design and implementation. The compiler discussion seems to have atrophied, so I thought I would fold implementation into the mix as well.

== Resources ==

> LLVM Language Implementation Tutorial

http://llvm.org/docs/tutorial/index.html

> Implementing Programming Languages

http://www.cse.chalmers.se/edu/year/2012/course/DAT150/lectures/plt-book.pdf

> CPython internals: A ten-hour codewalk through the Python interpreter source code

http://pgbovine.net/cpython-internals.htm

> Compilers: Principles, Techniques, and Tools (2nd ed)

https://mega.nz/#!sp8BUSRL!okEt_Eo3h7RtvfZm6bIcmRuZqMo0fiJky9MIye4Tz7U

> Essentials of Programming Languages (3rd ed)

https://mega.nz/#!Z8cEHDIS!L96la8aeeai2sZx7Nij4XPlxNUpaKGznKcxrNrIYpjI

> Modern Compiler Implementation in C

https://mega.nz/#!Y5tC1IhA!sesyVwxWBMG0TC1R9C6DuOqzWhUG4XmhIsOlGbPSHTQ

> Compiler Optimization and Code Generation

http://bears.ece.ucsb.edu/class/ece253/compiler_opt/c1.pdf

> Garbage Collection Algorithms

http://courses.cs.washington.edu/courses/csep521/07wi/prj/rick.pdf

More to come
>>

 No.9621

File: 1442754002361.png (11.14 KB, 900x532, repl.png)

>CPython internals
lel.

Structure and Interpretation of Computer Programs
http://sarabander.github.io/sicp/

Programming Languages: Application and Interpretation
http://www.plai.org/

Object-Oriented Programming Languages: Application and Interpretation
http://users.dcc.uchile.cl/~etanter/ooplai/

The Lambda Calculus for Absolute Dummies
http://palmstroem.blogspot.be/2012/05/lambda-calculus-for-absolute-dummies.html

AN INTRODUCTION TO FUNCTIONAL PROGRAMMING THROUGH LAMBDA CALCULUS
http://www.cs.rochester.edu/~brown/173/readings/LCBook.pdf

Types and Programming Languages
http://port70.net/~nsz/articles/book/pierce_types_and_programming_languages_2002.pdf

Good Ideas, Through the Looking Glass
https://www.inf.ethz.ch/personal/wirth/Articles/GoodIdeas_origFig.pdf

Virtual Machine Showdown: Stack Versus Registers
https://www.usenix.org/legacy/events/vee05/full_papers/p153-yunhe.pdf

The Implementation of Functional Programming Languages
http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/index.htm

Quantifying the Performance of Garbage Collection vs. Explicit Memory Management
https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf

>>

 No.9631

File: 1442780312243.gif (1.97 MB, 320x240, monkey.gif)

Not really related to topic:
How much of a moron must one be to build an LL(*) parser using C++ OOP?

Joking aside, the basic premise is the following: While walking the parse tree the interpreter changes behavior by switching its strategy for handling the input based on past output.

It works as expected but this whole construct feels like kicking dead whales.
Would it be better to just scrap that idea and stick to parser generation tools?

>>

 No.9641

>>9631
Tools like bison/flex are nice, mainly because you don't need to worry about the dispatch tables and such, seeing as they implement the lexing and parsing as one big state machine. But for the most part, when you use a tool like that you are locked into the workflow of that tool's output, and in the case of bison/flex that output is rather grimy.

On the other hand, you could use a tool like re2c, where you have a bit more control: it only generates the minimal code needed for lexing, and everything else is up to you.

I've used bison/flex for a basic interpreter, but the bootstrapping needed to get it to do anything was much more of a pain in the ass than hand-writing one, although the generated code would arguably run faster.

Also, on the implementation side, a hand-written parser done in pure C++ OOP style (even though C++ is not a pure OOP language) seems kinda fuzzy to me.

Rather than using a massive switch-case or if-else ladder, I would recommend trying computed gotos (see the sketch at the end of this post), for two reasons.

1) The compiler can often generate better dispatch code from computed gotos than from a massive switch-case lookup table.

2) Even though "GOTOS ARE EVIL ZOMG", when used properly you can get good control flow out of them if you know what you are doing.

I would experiment with a parser generator on your problem, but if you do want to milk it for efficiency, a hand-written parser is much more tuneable than a canned / runtime solution, even if it seems more painful than it should be.
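
A minimal sketch of that kind of dispatch, assuming GCC/Clang's labels-as-values extension (the token kinds and the input are made up for illustration):

#include <stdio.h>

enum tok { TOK_NUM, TOK_PLUS, TOK_END };

/* Dispatch over a token stream with computed gotos (GCC/Clang extension:
   &&label takes the address of a label, goto * jumps to it). */
static void dispatch(const enum tok *toks)
{
    static void *table[] = { &&do_num, &&do_plus, &&do_end };
    int i = 0;

    #define NEXT() goto *table[toks[i++]]
    NEXT();

do_num:
    printf("number\n");
    NEXT();
do_plus:
    printf("plus\n");
    NEXT();
do_end:
    printf("end\n");
    #undef NEXT
}

int main(void)
{
    enum tok input[] = { TOK_NUM, TOK_PLUS, TOK_NUM, TOK_END };
    dispatch(input);
    return 0;
}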

>>

 No.9642

File: 1442825489081.gif (1000.44 KB, 450x327, hexshit.gif)

>>9641

Thanks for the pointers.
The interpreter, I think, is solid enough to keep, since it works great (as in, well enough); it all comes down to a bunch of interchangeable FSMs.

Lexing+parsing, though, could use a good amount of work, since its implementation is dirty as fucĸ.

re2c reminds me of Ragel, I think I'll use either of those for lexical analysis.

>>

 No.9644

>>9642
No worries!

On the note of re2c, people have been swapping re2c and lemon in for the bison/flex pair, so it's something to look into if you still want to use a canned alternative, although re2c is less canned than flex.

>>

 No.9663

>>9620
As an exercise, I would like to try to implement a "programming" language (more exactly, I think, a DSL) for describing UIs in the cleanest and most efficient way possible.
The language should not compile to machine code or anything like that. Instead, the compilation process would involve specifying a target language and toolkit, and the compiler would output files defining the UI in that language.
The "programming" part should stay very minimalist.

I would like to implement this from scratch. Any recommendations besides the dragon book?

>>

 No.9665

>>9663
Modern Compiler Implementation in C (linked above) is a good start.

As for the "compilation" process, it sounds like it's mostly translation as opposed to compilation, so one would think you could have a template of sorts in the target language that you just pop values into and then write out the resulting code.
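
A minimal sketch of that template idea in C, with a made-up GTK-flavoured output format and field names just for illustration (the real target language and toolkit would come from the spec you mentioned):

#include <stdio.h>

/* Fill a fixed output template with values taken from a small UI description. */
struct button_desc {
    const char *id;
    const char *label;
    int width, height;
};

static void emit_button(FILE *out, const struct button_desc *b)
{
    fprintf(out,
            "<object class=\"GtkButton\" id=\"%s\">\n"
            "  <property name=\"label\">%s</property>\n"
            "  <property name=\"width-request\">%d</property>\n"
            "  <property name=\"height-request\">%d</property>\n"
            "</object>\n",
            b->id, b->label, b->width, b->height);
}

int main(void)
{
    struct button_desc ok = { "ok_button", "OK", 80, 24 };
    emit_button(stdout, &ok);
    return 0;
}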

>>

 No.9667

>>9641
>Tools like bison/flex are nice, mainly because you don't need to worry about the dispatch tables and such, seeing as they implement the lexing and parsing as one big state machine.
It's still insane to think about this.

Every language I use has an incredibly simple syntax and whatnot.

>>

 No.9668

File: 1442886835205.jpg (3.73 MB, 6791x708, 1438973201142.jpg)

I wrote a few "programming languages" a year or two ago, mostly languages that resembled lisp as if it were written in reverse Polish notation, since that was easy to parse. I played around with writing my own lexer / parser in C, and then made two different versions of the language: one compiled into bytecode and run by an interpreter, the other translated into x86 ASM and fed to an assembler / linker.

That was a while ago, and I haven't written anything or programmed in close to a year, so I remember close to nothing and I'm rusty as fuarrrk.

Going to try and rewrite a lexer/parser in C and then write a lisp-like interpreted language as a practice project for getting back into programming.

Does anyone have a good resource for the specifications of an incredibly simple, barebones lisp that I could use as the basis of my language to build off of? I programmed in lisp 3 years ago but I don't remember much.

>>

 No.9670

>>9668
I'm not a lisper so I can't say but some searching found http://nakkaya.com/2010/08/24/a-micro-manual-for-lisp-implemented-in-c/

It seems to be more of a how-to than a spec though.

It does however link to this paper that seems to be more of a spec: https://drive.google.com/file/d/0B0ZnV_0C-Q7IOTRkNzVjZjMtMWE1NC00YzQ3LTgzMWEtM2UwY2I1YzdmNmM5/view?ddrp=1&pli=1&hl=en#

>>

 No.9671

>>9667
Yeah, it's crazy to think how intricate that soykaf is.

I found a paper from '68, a graduate thesis on an efficient parser that runs in n^3 time in the general case, or n^2 for unambiguous grammars.

I would think that ideally you could get it down to n, and in fact the paper claims linear time for a large class of grammars (including, it seems, most practical programming-language grammars), but hey, it's complicated stuff.

Sauce:
http://ra.adm.cs.cmu.edu/anon/home/anon/usr/ftp/scan/CMU-CS-68-earley.pdf

>>

 No.9684

File: 1442908548591.png (52.28 KB, 931x832, complexity.png)

>>9671
This inspired me to put my current project through a quick test. It could surely be a tad better.

Interesting stuff, by the way; I find the part on prediction curious.

>>

 No.9716

I thought that some of you might be interested in this:
https://lobste.rs/s/pzzlwr/type_theory_reading_group_-_let_s_make_this_a_thing

It's a type theory reading group that is forming. If you are interested, you can sign up here:
https://docs.google.com/forms/d/11MKWhx23VGMvtAzgpuaWuD97iXj4SALyIynslNX5INo/viewform

>>

 No.9725

I need to get back to learning more about compilers and interpreters. I created a little language inspired by the C family of languages using LLVM. I have been spending a lot of time learning other programming languages, and I can't believe just how much I had missed by not learning about them. So right now I am thinking about getting back to creating programming languages and trying to work on open source compilers and interpreters. My next project will probably be a Scheme interpreter based on the R7RS standard.

>>

 No.9734

File: 1442982755778.gif (582.65 KB, 640x360, J3G0KkJ.gif)

>>9725
That's good; working on projects is a good way to learn. I wish you the best.

I myself am working on developing a language, mainly for fun but also to try to do better than Nim (I have an acquaintance who works on that project, so it's some `friendly` competition).

So far I've only decided on the name (Atrophy) and some basic language features such as static typing and reflection. I still need to work out a sane syntax and some other language features. There's still a fair deal of work to be done.

>>

 No.9735

>>9684
Curious, what did you use to make this plot? My first thought was R or GNUPlot.

>>

 No.9738

>>9684
I guess Y is time and X is size, but why don't you label them?

>>

 No.9741

>>9738
Laziness.

>>9735
Gnumeric actually.

>>

 No.9742

>>9741
> Gnumeric actually.
Hrmm, that would make sense, more so than my ideas I suppose.

>>

 No.9746

>>9665
I don't understand you. Compilation is just a specific translation.

>>

 No.9747

>>9746
Yeah, but compilation also implies explicit optimization, loop unrolling and such; at least, that falls under my definition of a compiler.

At that point it's just a matter of terminology.

>>

 No.9761

File: 1443032471399.jpeg (18 KB, 320x320, i320.jpeg)

>>9644
Thanks again for pointing me in the right direction.

Just built a simple arithmetic parser, working great so far.

It's also the first time in years that I've used a stack to evaluate the resulting RPN of an expression; it's pretty swell.
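
Roughly, the stack evaluation looks like this minimal sketch over an already-tokenized RPN expression (the token struct here is simplified for illustration, not my actual code):

#include <stdio.h>

/* Evaluate a pre-tokenized RPN stream: operands are pushed, and an
   operator pops two values and pushes the result. */
enum kind { NUM, OP };
struct tok { enum kind kind; double num; char op; };

static double eval_rpn(const struct tok *toks, int n)
{
    double stack[64];
    int top = 0;

    for (int i = 0; i < n; i++) {
        if (toks[i].kind == NUM) {
            stack[top++] = toks[i].num;
        } else {
            double b = stack[--top];
            double a = stack[--top];
            switch (toks[i].op) {
            case '+': stack[top++] = a + b; break;
            case '-': stack[top++] = a - b; break;
            case '*': stack[top++] = a * b; break;
            case '/': stack[top++] = a / b; break;
            }
        }
    }
    return stack[0];
}

int main(void)
{
    /* (3 + 4) * 2 in RPN: 3 4 + 2 * */
    struct tok expr[] = {
        { NUM, 3, 0 }, { NUM, 4, 0 }, { OP, 0, '+' },
        { NUM, 2, 0 }, { OP, 0, '*' }
    };
    printf("%g\n", eval_rpn(expr, 5));  /* prints 14 */
    return 0;
}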

>>

 No.9767

>>9761
Don't sweat it!

Stacks do indeed work. I'm more of a fan of trees, but each has its place.

Stacks do sit in contiguous blocks of memory, though, unlike most tree representations, so you get faster access times.

>>

 No.9778

>>9767

Well, to be honest, I parse the expression into RPN using the shunting-yard algorithm, then create an abstract syntax tree by iterating through the RPN in reverse; that tree later gets evaluated using a stack by traversing it postfix, which, come to think of it, gets me exactly where I was before.

Seems awful and contrived, especially since the code uses a dynamic cast while visiting the nodes, but it gives me more options, I presume.

I just love multi-dispatches too much to not do such shít.
But hey, I can print trees infix and add new operations and leaf-types later, that must count for something.
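
For illustration, a stripped-down sketch of that reverse-iteration step over a toy single-character RPN string (digits and binary operators only, nothing like the real multi-dispatch version), including the infix printing:

#include <stdio.h>
#include <stdlib.h>

/* Build an AST from a single-character RPN string (digits and + - * /)
   by walking it from the end: the last character is the root; its right
   subtree is built first, then its left, since RPN puts operands before
   their operator. */
struct node {
    char op;                    /* 0 for a number leaf */
    int value;                  /* valid when op == 0 */
    struct node *left, *right;
};

static struct node *build(const char *rpn, int *pos)
{
    struct node *n = calloc(1, sizeof *n);
    char c = rpn[(*pos)--];

    if (c >= '0' && c <= '9') {
        n->value = c - '0';
    } else {
        n->op = c;
        n->right = build(rpn, pos);  /* operand closest to the operator */
        n->left  = build(rpn, pos);
    }
    return n;
}

static void print_infix(const struct node *n)
{
    if (n->op == 0) { printf("%d", n->value); return; }
    printf("(");
    print_infix(n->left);
    printf(" %c ", n->op);
    print_infix(n->right);
    printf(")");
}

int main(void)
{
    const char *rpn = "34+2*";          /* (3 + 4) * 2 */
    int pos = 4;                        /* index of the last character */
    struct node *root = build(rpn, &pos);
    print_infix(root);                  /* prints ((3 + 4) * 2) */
    printf("\n");
    return 0;
}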

>>

 No.9787

Here's an excellent paper that takes you through the process of writing a simple Scheme compiler:
http://scheme2006.cs.uchicago.edu/11-ghuloum.pdf

>>

 No.9788

Do you know anything good on compiling logic languages, like Prolog?

>>

 No.9792

File: 1443111134486.jpg (50.58 KB, 659x633, feels.jpg)

>>9788
Have a look at mercury's papers section.

https://www.mercurylang.org/documentation/papers.html

>>

 No.9793

>>9792
>if (no_gf == true)
>feels ++

kimochi warui

>>

 No.9794

File: 1443114078022.jpg (151.67 KB, 519x538, 1437265677595.jpg)

>>9792
Some really cool papers, thanks!

>>

 No.9802

File: 1443127141315.gif (950.04 KB, 500x484, 1434461938315.gif)

>>9787
This is fuarrrk ing awesome! But the link at the end is dead. Do you know where I can find the extended tutorial and the test cases?

>>

 No.9853

File: 1443279486065.png (150.7 KB, 600x531, 1442993983634.png)

>>9792
Those papers look really quite interesting, thanks for sharing.

>>

 No.9923

Hi. I developed an interpreter for a simple language, one or two years ago. Example:
number(x)
  succ: x+1
  pred: x-1

test: [
  number(0).succ == 1,
  number(0).pred == -1
]

The first line is short for
number(x): this


It behaves like the following JS code:
function() {
  var number = function(x) {
    var succ = function() { return x+1; };
    var pred = function() { return x-1; };
    var this_ = {"succ": succ, "pred": pred};
    return this_;
  };

  var test = function() {
    var this_ = {};
    // succ and pred are modelled as zero-argument functions here, so they have to be called
    return [number(0).succ() == 1, number(0).pred() == -1];
  };

  var this_ = {"number": number, "test": test};
  return this_;
}


The language has no way to express side-effects yet, so I imagine it to be embedded in a bigger language.

timeout(sec, cond): stateful(State(null), cond).timeouted
State(start)
impulse(i): State(start and i ? start : now)
timeouted: now - start >= sec


The idea here ^ is that cond is a data source of true and false values, and `timeout(5, data)` will be `true` five seconds after the last `false` message on the channel.
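
Roughly, in C, with the message stream made explicit (the struct and sample times are just for illustration):

#include <stdio.h>
#include <stdbool.h>

/* Mimics timeout(sec, cond): `start` tracks the time of the last false
   message (or of the first message), and the result becomes true once
   now - start >= sec. */
struct timeout_state {
    double sec;
    double start;
    bool   started;
};

static bool timeout_impulse(struct timeout_state *s, double now, bool cond)
{
    if (!s->started || !cond) {   /* reset on the first message or on false */
        s->start = now;
        s->started = true;
    }
    return now - s->start >= s->sec;   /* "timeouted" */
}

int main(void)
{
    struct timeout_state t = { 5.0, 0.0, false };

    printf("%d\n", timeout_impulse(&t, 0.0, false));  /* 0 */
    printf("%d\n", timeout_impulse(&t, 3.0, true));   /* 0: only 3s since last false */
    printf("%d\n", timeout_impulse(&t, 6.0, true));   /* 1: 6s since last false */
    return 0;
}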

>>

 No.9928

>>9923
What was the idea behind the syntax? It looks like Python mixed with JSON.

Also, what was it implemented in? Did you roll your own parser/lexer or did you use a parser generator? So many questions, so little information.

>>

 No.9931

>>9928
It was implemented in Python. I made the parser using code from http://effbot.org/zone/simple-top-down-parsing.htm

It was the result of an attempt at a compact DSL for describing game rules for an FPS. From my notes:
obj ctf(game):   // obj is a macro, forgot what it was used for
  world: game.world   // shorthand
  flags: map(flaglogic, game.teams)   // ensures there's one flag per team

  overlay player:   // makes player.can_carry visible within 'ctf'
    can_carry: alive   // player.can_carry is the same as player.alive

obj flaglogic(team):
  // helper function hold(f, g) will hold a value x from g as long as f(x) is true
  // the VM will ensure that g is reevaluated only when f(x) makes a downwards transition
  // $.attr is a syntax for λx: x.attr
  // first(list, key) = sorted(list, key)[0]
  carrier: hold($.can_carry, first(eligible_carriers, key=distance))
  home: world.lookup((:ctf, "home", team.tag))
  // wasn't sure whether to allow python list-comprehensions or use functions like 'filter'
  eligible_carriers: filter($.can_carry, colliders)
  provide transform:
    carrier
      ? ( carrier.attachment("flag") or carrier.attachment("over head") )
      : ( expired
          ? home.location
          // for simplephysics the VM was supposed to take the derivative of the
          // transform and continue moving the object along trajectory
          : simplephysics(collisions) )

  expired: timeout(10, not carrier)

>>

 No.9959

>>9931
That link looks really good; I can't wait to have a play with the code on that page. I've been doing a parser tutorial in Haskell, but it's hard to keep track of the concepts and learn a new language at the same time.

>>

 No.10139

File: 1443981849383.gif (2.58 MB, 600x338, Ru77fq7.gif)

Anyone happen to have an ACM SIGPLAN subscription? There are some papers there that would be nice to add to the resources list, but money.

>>

 No.10141

>>10139
Wouldn't sci-hub work?
http://sci-hub.club/

>>

 No.10142

>>10141
I had no idea about that; I was looking into https://scirate.com/ as well.

>>

 No.10148

>>10142
What papers are you looking for? I can help hunt them down.

>>

 No.10245

File: 1444223591679.gif (556.25 KB, 992x405, 2Pe6lht.gif)

>>10148
It was just some things on LALR(1) and LL(*) parsers and the like. I found some good ones, thanks anyway.

>>

 No.10928

> http://compcert.inria.fr/doc/index.html
> CompCert is a compiler that generates PowerPC, ARM and x86 assembly code from CompCert C, a large subset of the C programming language. The particularity of this compiler is that it is written mostly within the specification language of the Coq proof assistant, and its correctness --- the fact that the generated assembly code is semantically equivalent to its source program --- was entirely proved within the Coq proof assistant.

>>

 No.11275

File: 1446343508231.jpg (50.83 KB, 850x638, 12003216_1683435058555344_….jpg)

>>10928
CompCert has had some undefined behaviour and compiler bugs as of late, if I recall correctly. But the fact that it was written in Coq astounds me.

>>

 No.11323

>>10928
Damn awesome.

I really want to get into these proof languages, but I can't wrap my head around their concepts. (Looking into Microsoft's Lean right now.)

For reverse engineering, for example, it would be great to prove that the reconstructed source code does the same thing as the original.

I can imagine jobs in that area being very well paid.

Also, >Coq Present Day, Present Time! AHAHAHAHAHA!

>>

 No.11324

>>11323 (cont.) So it seems the ebin meme face gets word-filtered. Well, that's fine with me.

>>

 No.11344

File: 1446470449119.png (276.2 KB, 650x600, coq-tan.png)


>>

 No.11345

>>11323
Lean is cool. Its syntax is very similar to Coq's and it's even built on the same logic.

>For reverse engineering for example, it would be great to prove that the reconstructed source code does the same as the original

I'm not sure this would be the right job for manual verification. You tend to run into a lot of repetitive and tedious proofs when proving these things from the bottom up. It's a different matter when you already have high-level models of both sides and want to prove specific equivalences.

What you want is to prove functional correspondence between two programs. Depending on the level of abstraction, that means something like symbolic execution or model checking, which are mostly automated (depending on how you obtain your model) methods for proving correctness of programs. There are many tools for this, too.

One of them is SAW (http://saw.galois.com/index.html), which uses SMT and symbolic execution to verify properties of C and Java programs. The guys from Galois, Inc. had a talk recently about verification of cryptographic functions using SMT; the slides are on GitHub, for those interested: https://github.com/GaloisInc/sat2015-crypto
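
To make "functional correspondence" a bit more concrete, here's a toy sketch that checks two implementations against each other by brute force over a bounded domain; real tools like SAW establish the same kind of property symbolically with an SMT solver instead of enumerating inputs (both functions here are made up for illustration):

#include <stdint.h>
#include <stdio.h>

/* "Reference" implementation: average of two bytes, the obvious way. */
static uint8_t avg_ref(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint16_t)a + b) / 2);
}

/* "Optimized" implementation: the classic branch-free formulation. */
static uint8_t avg_opt(uint8_t a, uint8_t b)
{
    return (uint8_t)((a & b) + ((a ^ b) >> 1));
}

int main(void)
{
    /* Exhaustively check equivalence over the whole 8-bit input domain. */
    for (int a = 0; a < 256; a++) {
        for (int b = 0; b < 256; b++) {
            if (avg_ref((uint8_t)a, (uint8_t)b) != avg_opt((uint8_t)a, (uint8_t)b)) {
                printf("counterexample: a=%d b=%d\n", a, b);
                return 1;
            }
        }
    }
    printf("equivalent on all 8-bit inputs\n");
    return 0;
}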

>I can imagine jobs in that area to be very well paid.

I can confirm. They're also not the easiest and you must know tons of theory to be efficient and innovative, though.

>>11344
Here's another one: http://ilyasergey.net/pnp/
I have tons of links to other resources, but I'd be too embarrassed to publicize them in their current form.

>>

 No.11346

>>11345
BTW, I was considering writing a short article (perhaps even series) about some topic from this area, but I'm not sure how useful it would be given the amount of good resources that have already been written.

>>

 No.11348

>>11346
That would be really cool. Maybe you could get it into the lainzine?

>>

 No.11349

>>11348
Oh, of course, I meant to write "a short article for the zine", but it got lost somewhere in editing.

>>

 No.11354

>>11345
Considering that reconstructing C source is already tedious and has to be done function by function (even with the help of a decompiler), adding proofs seems like a linear overhead.
I don't know the exact distinction between all these terms (model checking, etc.), but I think you can always convert them to a proof later.

I imagine that one could develop tools to automate this work: mark a few correspondences between source code and the binary's flow graph, or between source variables and registers, and let some algorithm figure out the rest.

>One of them is SAW (http://saw.galois.com/index.html), which uses SMT and symbolic execution to verify properties of C and Java programs.

I wonder how well it copes with memory accesses (cryptographic functions are mostly pure).


