I am in my third programming class in college, and we have only done C++ up to this point. My current class is on Java, and we are learning that C++ is compiled while Java uses a mixture of compilation and interpretation. Some people are having a lot of difficulty with this idea, especially those coming from JavaScript.
I understand that for a language to be interpreted means it is read line by line, converted, and executed right away; for a language to be compiled means that the entire source code is translated first and the resulting file is then executed.
My questions:
But in practice (i.e., when actually writing code and developing software), how are compiled languages and interpreted languages different?
Also, how is Java both? I haven't noticed any differences from using Java to using C++.
A compiled language is one where the program, once compiled, is expressed in the instructions of the target machine. For example, an addition "+" operation in your source code could be translated directly to the "ADD" instruction in machine code.
An interpreted language is one where the instructions are not directly executed by the target machine, but instead read and executed by some other program (which normally is written in the language of the native machine). For example, the same "+" operation would be recognised by the interpreter at run time, which would then call its own "add(a,b)" function with the appropriate arguments, which would then execute the machine code "ADD" instruction.
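To make the "+" example concrete, here is a minimal sketch of the interpreted side, written in Python. The tuple-based expression format and the names `add` and `evaluate` are made up for illustration; no real interpreter is implemented exactly like this.

```python
# Hypothetical mini-language: an expression is either a number
# or a tuple of the form (operator, left, right).

def add(a, b):
    # The interpreter's own helper. The hardware ADD instruction is only
    # reached indirectly, through the program that runs this function.
    return a + b

def evaluate(expr):
    """Walk the expression at run time and decide what each operator means."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    if op == "+":
        # The "+" is recognised here, while the program is running.
        return add(evaluate(left), evaluate(right))
    raise ValueError(f"unknown operator: {op!r}")

result = evaluate(("+", 1, ("+", 2, 3)))
print(result)
```

A compiler would instead make the decision about "+" once, ahead of time, and emit the machine instruction, so nothing like `evaluate` would exist at run time.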
You can do anything that you can do in an interpreted language in a compiled language and vice-versa - they are both Turing complete. Both however have advantages and disadvantages for implementation and use.
I'm going to completely generalise (purists forgive me!) but, roughly, here are the advantages of compiled languages:
- Faster performance by directly using the native code of the target machine
- Opportunity to apply quite powerful optimisations during the compile stage
And here are the advantages of interpreted languages:
- Easier to implement (writing good compilers is very hard!!)
- No need to run a compilation stage: can execute code directly "on the fly"
- Can be more convenient for dynamic languages
Note that modern techniques such as bytecode compilation add some extra complexity - what happens here is that the compiler targets a "virtual machine" which is not the same as the underlying hardware. These virtual machine instructions can then be compiled again at a later stage to get native code (e.g. as done by the Java JVM JIT compiler).
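As a rough illustration of targeting a "virtual machine", the sketch below compiles an expression into instructions for a toy stack machine and then runs them. The instruction names (`PUSH`, `ADD`, `MUL`) and the functions `compile_expr` and `run` are assumptions invented for this example; real JVM bytecode is far richer.

```python
# Hypothetical instruction set for a toy stack-based virtual machine.

def compile_expr(expr):
    """Translate a nested (op, left, right) tuple into a flat instruction list."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    opcode = {"+": "ADD", "*": "MUL"}[op]
    # Operands first, operator last: the usual order for a stack machine.
    return compile_expr(left) + compile_expr(right) + [(opcode, None)]

def run(bytecode):
    """Execute the instructions one by one on an operand stack."""
    stack = []
    for opcode, arg in bytecode:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {opcode!r}")
    return stack.pop()

program = compile_expr(("+", 1, ("*", 2, 3)))
print(program)       # the "bytecode" for the virtual machine
print(run(program))  # the virtual machine executing it
```

The compile step runs once; the `run` loop plays the role of the virtual machine. A JIT compiler would go one step further and turn frequently executed instruction sequences into native code.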
A language itself is neither compiled nor interpreted, only a specific implementation of a language is. Java is a perfect example. There is a bytecode-based platform (the JVM), a native compiler (gcj) and an interpreter for a superset of Java (bsh). So what is Java now? Bytecode-compiled, native-compiled or interpreted?
Other languages that have both compiled and interpreted implementations include Scala, Haskell, and OCaml. Each of these languages has an interactive interpreter, as well as a compiler to bytecode or native machine code.
So generally categorizing languages by "compiled" and "interpreted" doesn't make much sense.
There's (to my knowledge) no such thing as an interpreted "language" or a compiled "language".
Languages specify the syntax and meaning of the code's keywords, flow constructs and various other things, but I am aware of no language which specifies whether or not it must be compiled or interpreted in the language spec.
Now if your question is when to use a language compiler vs. a language interpreter, it really comes down to the pros and cons of the compiler vs. the interpreter and the purpose of the project.
For instance, you may use the JRuby compiler for easier integration with Java libraries instead of the MRI Ruby interpreter. There are likely also reasons to use the MRI Ruby interpreter over JRuby, but I'm unfamiliar with the language and can't speak to this.
Touted benefits of interpreters:
- No compilation means the time from editing code to testing the app can be diminished
- No need to generate binaries for multiple architectures because the interpreter will manage the architecture abstraction (though you may need to still worry about the scripts handling integer sizes correctly, just not the binary distribution)
Touted benefits of compilers:
- Compiled native code does not have the overhead of an interpreter and is therefore usually more efficient on time and space
- Interoperability is usually better; the only way to interoperate in-process with scripts is via an interpreter rather than a standard FFI
- Ability to support architectures the interpreter hasn't been compiled for (such as embedded systems)
However, I would bet in 90% of cases it goes something more like this: I want to write this software in blub because I know it well and it should do a good job. I'll use the blub interpreter (or compiler) because it is the generally accepted canonical method for writing software in blub.
So the TL;DR is basically: compare the interpreters and compilers available for your particular use case, on a case-by-case basis.
Also, FFI stands for Foreign Function Interface, in other words an interface for interoperating with other languages. More reading at Wikipedia.
An important point here is that many language implementations actually do some sort of hybrid of both. Many commonly used languages today work by compiling a program into an intermediate format such as bytecode, and then executing that in an interpreter. This is how Java, C#, Python, Ruby, and Lua are typically implemented. In fact, this is arguably how most languages in use today are implemented. So, the fact is, languages today both interpret and compile their code. Some of these languages have an additional JIT compiler to convert the bytecode to native code for execution.
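You can watch this hybrid at work in Python itself. The sketch below assumes the standard CPython implementation, where the built-in `compile` produces a code object whose `co_code` attribute holds the raw bytecode; other Python implementations may expose this differently.

```python
import types

source = "a + b"

# Step 1: compile the source text into a code object (bytecode inside).
code = compile(source, "<example>", "eval")
print(isinstance(code, types.CodeType))
print(type(code.co_code))  # the bytecode itself, stored as bytes in CPython

# Step 2: hand the compiled code object to the interpreter loop.
value = eval(code, {"a": 2, "b": 3})
print(value)
```

The same two steps happen implicitly every time you run a `.py` file, which is why calling Python simply "interpreted" hides half of the picture.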
In my opinion, we should stop talking about interpreted and compiled languages because they are no longer useful categories for distinguishing the complexities of today's language implementations.
When you ask about the merits of interpreted and compiled languages, you probably mean something else. You may be asking about the merit of static/dynamic typing, the merits of distributing native executables, the relative advantages of JIT and AOT compilation. These are all issues which get conflated with interpretation/compilation but are different issues.
It's important to remember that interpreting and compiling are not just alternatives to each other. In the end, any program that you write (including one compiled to machine code) gets interpreted. Interpreting code simply means taking a set of instructions and returning an answer.
Compiling, on the other hand, means converting a program in one language to another language. Usually it is assumed that when compilation takes place, the code is compiled to a "lower-level" language (eg. machine code, some kind of VM bytecode, etc.). This compiled code is still interpreted later on.
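Here is a small sketch of "compiling" in this broad sense: translating a program from one language into another. The source language (nested tuples) and the function name `to_python_source` are hypothetical; the target language is ordinary Python source text, which still has to be interpreted afterwards.

```python
def to_python_source(expr):
    """Translate a nested (op, left, right) tuple into Python source text."""
    if isinstance(expr, (int, float)):
        return repr(expr)
    op, left, right = expr
    if op not in {"+", "-", "*"}:
        raise ValueError(f"unsupported operator: {op!r}")
    return f"({to_python_source(left)} {op} {to_python_source(right)})"

# Compilation: one language in, another language out. Nothing is executed yet.
translated = to_python_source(("+", 1, ("*", 2, 3)))
print(translated)

# The translated program is only run when something interprets it.
print(eval(translated))
```

Note that the target here is not "lower-level" at all, which is allowed by the definition; targeting machine code or bytecode is just the common case.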
With regards to your question of whether there is a useful distinction between interpreted and compiled languages, my personal opinion is that everyone should have a basic understanding of what is happening to the code they write during interpretation. So, if their code is being JIT compiled, or bytecode-cached, etc., the programmer should at least have a basic understanding of what that means.
The distinction is deeply meaningful because compiled languages restrict the semantics in ways that interpreted languages do not necessarily. Some interpretive techniques are very hard (practically impossible) to compile.
Interpreted code can do things like generate code at run time, and give that code visibility into lexical bindings of an existing scope. That's one example. Another is that interpreters can be extended with interpreted code which can control how code is evaluated. This is the basis for ancient Lisp "fexprs": functions that are called with unevaluated arguments and decide what to do with them (having full access to the necessary environment to walk the code and evaluate variables, etc). In compiled languages, you can't really use that technique; you use macros instead: functions that are called at compile time with unevaluated arguments, and translate the code rather than interpreting.
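To give a feel for fexprs, here is a minimal sketch of an evaluator in which some operators receive their arguments unevaluated, together with the environment. The evaluator, the `is_fexpr` marker, and all names in `env` are invented for this example and only loosely follow what old Lisps did.

```python
def evaluate(expr, env):
    """Tiny evaluator: strings are variable names, tuples are calls."""
    if isinstance(expr, str):
        return env[expr]
    if not isinstance(expr, tuple):
        return expr  # a literal such as a number
    head, *args = expr
    operator = evaluate(head, env)
    if getattr(operator, "is_fexpr", False):
        # fexpr: pass the raw argument expressions plus the environment.
        return operator(args, env)
    # Ordinary function: evaluate the arguments first, then call.
    return operator(*(evaluate(arg, env) for arg in args))

def fexpr_if(args, env):
    """Decides at run time which of its unevaluated arguments to evaluate."""
    condition, then_branch, else_branch = args
    chosen = then_branch if evaluate(condition, env) else else_branch
    return evaluate(chosen, env)

fexpr_if.is_fexpr = True  # hypothetical marker checked by evaluate()

def fail():
    raise RuntimeError("this branch should never be evaluated")

env = {"if": fexpr_if, "fail": fail, "flag": True, "answer": 42}
result = evaluate(("if", "flag", "answer", ("fail",)), env)
print(result)
```

Because `fexpr_if` makes its decision while the program runs, with full access to `env`, a compiler cannot know ahead of time which arguments will be evaluated. A macro would make the equivalent decision at compile time by rewriting the code instead.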
Some language implementations are built around these techniques; their authors reject compiling as being an important goal, and rather embrace this kind of flexibility.
Interpreting will always be useful as a technique for bootstrapping a compiler. For a concrete example, look at CLISP (a popular implementation of Common Lisp). CLISP has a compiler that is written in itself. When you build CLISP, that compiler is being interpreted during the early building steps. It is used to compile itself, and then once it is compiled, compiling is then done using the compiled compiler.
Without an interpreter kernel, you would need to bootstrap with some existing Lisp, like SBCL does.
With interpretation, you can develop a language from absolute scratch, starting with assembly language. Develop the basic I/O and core routines, then write an eval, still in machine language. Once you have eval, write in the high-level language; the machine code kernel does the evaluating. Use this facility to extend the library with many more routines and also to write a compiler. Use the compiler to compile those routines and the compiler itself.
Interpretation: an important stepping stone in the path leading to compilation!
What’s the difference between compiled and interpreted language?
The difference is not in the language; it is in the implementation.
Having got that out of my system, here's an answer:
In a compiled implementation, the original program is translated into native machine instructions, which are executed directly by the hardware.
In an interpreted implementation, the original program is translated into something else. Another program, called "the interpreter", then examines "something else" and performs whatever actions are called for. Depending on the language and its implementation, there are a variety of forms of "something else". From more popular to less popular, "something else" might be:
- Binary instructions for a virtual machine, often called bytecode, as is done in Lua, Python, Ruby, Smalltalk, and many other systems (the approach was popularized in the 1970s by the UCSD P-system and UCSD Pascal)
- A tree-like representation of the original program, such as an abstract-syntax tree, as is done for many prototype or educational interpreters
- A tokenized representation of the source program, similar to Tcl
- The characters of the source program, as was done in MINT and TRAC
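Two of these intermediate forms can be inspected using Python's standard library, which happens to expose them. This is only a sketch of what the representations look like for one tiny expression; it says nothing about how Tcl or any other system stores them internally.

```python
import ast
import io
import tokenize

source = "1 + 2"

# A tokenized representation: the characters grouped into meaningful pieces.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()  # skip the empty end-of-line and end-of-file tokens
]
print(tokens)

# A tree-like representation: an abstract-syntax tree of the same program.
tree = ast.parse(source, mode="eval")
print(type(tree.body).__name__)     # the node kind for the whole expression
print(type(tree.body.op).__name__)  # the node kind for the operator
```

An interpreter could work from either form directly; the further down the list you go, the more work is repeated every time the same piece of code runs.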
One thing that complicates the issue is that it is possible to translate (compile) bytecode into native machine instructions. Thus, a successful interpreted implementation might eventually acquire a compiler. If the compiler runs dynamically, behind the scenes, it is often called a just-in-time compiler or JIT compiler. JITs have been developed for Java, JavaScript, Lua, and I daresay many other languages. At that point you can have a hybrid implementation in which some code is interpreted and some code is compiled.
[Struck through in the original answer:] Java and JavaScript are a fairly bad example to demonstrate this difference, because both are interpreted languages. Java (interpreted) and C (or C++) (compiled) might have been a better example.
Why the struck-through text? As another answer correctly points out, interpreted/compiled is about a concrete implementation of a language, not about the language per se. While statements like "C is a compiled language" are generally true, there's nothing to stop someone from writing a C language interpreter. In fact, interpreters for C do exist.
Basically, compiled code can be executed directly by the computer's CPU. That is, the executable code is specified in the CPU's "native" language (machine code, of which assembly language is the human-readable form).
The code of interpreted languages, however, must be translated at run time from whatever format it is in to CPU machine instructions. This translation is done by an interpreter.
Another way of putting it is that in interpreted languages, code is translated to machine instructions step by step while the program is being executed, whereas in compiled languages, code has been translated before program execution.