We will show the more general statement: $\| \cdot \|_1$, $\| \cdot \|_2$, and $\| \cdot \|_{\infty}$ are all equivalent on $\mathbb{R}^{n}$, with $$\| x \|_{\infty} \leq \| x \|_{2} \leq \| x \|_{1} \leq n \| x \|_{\infty}$$
Every $x \in \mathbb{R}^{n}$ has the representation $x = ( x_1 , x_2 , \dots , x_n )$ in the canonical basis $e_i$ of $\mathbb{R}^{n}$, where $e_i$ has $1$ in the $i^{\text{th}}$ position and $0$ elsewhere. Then $$\| x \|_{\infty} = \max_{1\leq i \leq n} | x_i | = \max_{1\leq i \leq n} \sqrt{ | x_i |^{2} } \leq \sqrt{ \sum_{i=1}^{n} | x_ i |^{2} } = \| x \|_2 $$ Additionally, $$ \| x \|_2 = \sqrt{ \sum_{i=1}^{n} | x_i |^{2} } \leq \sum_{i=1}^{n} \sqrt{ | x_ i |^{2} } = \sum_{i=1}^{n} |x_i| = \| x \|_1$$ Finally, $$ \| x \|_1 = \sum_{i=1}^{n} |x_i| \leq \sum_{i=1}^{n} \max_{1 \leq j \leq n} | x_j | = n \max_{1 \leq j \leq n} | x_j | = n \| x \|_{\infty}$$ which establishes the chain of inequalities. Consequently, if $(x_k)$ is a sequence in $\mathbb{R}^{n}$ with $\| x - x_{k} \| \to 0$ as $k \to \infty$ in any one of these norms, the same holds in the other two, since each norm is sandwiched between constant multiples of the others; this is exactly what it means for the norms to be equivalent.
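The chain of inequalities above is easy to spot-check numerically. A minimal sketch (the random test vectors and dimension are arbitrary choices, not part of the proof):

```python
import math
import random

def norm_inf(x):
    """Max-norm: largest absolute component."""
    return max(abs(c) for c in x)

def norm_2(x):
    """Euclidean norm: square root of the sum of squares."""
    return math.sqrt(sum(c * c for c in x))

def norm_1(x):
    """Sum of absolute components."""
    return sum(abs(c) for c in x)

# Spot-check ||x||_inf <= ||x||_2 <= ||x||_1 <= n * ||x||_inf
# on random vectors (illustrative test data only).
random.seed(0)
n = 5
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(n)]
    assert norm_inf(x) <= norm_2(x) <= norm_1(x) <= n * norm_inf(x) + 1e-12
```

A check like this never replaces the proof, but it is a quick way to catch a misstated inequality before trying to prove it.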
The inequality $ \|x\|_1 \leq \sqrt{n} \,\|x\|_2 $ is a consequence of Cauchy-Schwarz: pairing $(|x_1|, \dots, |x_n|)$ with the all-ones vector gives
$$\|x\|_1 = \sum_{i} |x_i| \cdot 1 \leq \sqrt{1+1+\cdots+1}\,\sqrt{\sum_{i} x_i^2 } = \sqrt n\, \|x\|_2$$
For the inequality $\|x\|_2 \leq \|x\|_1$: the function $f(t)=\sqrt{t}$ is concave with $f(0)=0$, hence $f$ is subadditive. Therefore $$\|x\|_2 = f\Big(\sum_{i} x_i^2\Big) \leq \sum_{i} f(x_i^2) = \sum_{i} |x_i| = \|x\|_1$$
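The Cauchy-Schwarz bound $\|x\|_1 \leq \sqrt{n}\,\|x\|_2$ is tight exactly when all $|x_i|$ are equal, since that is when the paired vectors are parallel. A small sketch of both the equality case and a strict case (vectors chosen for illustration):

```python
import math

def norm_1(x):
    return sum(abs(c) for c in x)

def norm_2(x):
    return math.sqrt(sum(c * c for c in x))

# Equality in ||x||_1 <= sqrt(n) * ||x||_2 at the all-ones vector:
x = [1.0] * 9                     # all components equal, n = 9
assert abs(norm_1(x) - math.sqrt(9) * norm_2(x)) < 1e-12

# For a vector concentrated on one coordinate the inequality is strict:
y = [3.0, 0.0, 0.0, 0.0]
assert norm_1(y) < math.sqrt(4) * norm_2(y)   # 3 < 2 * 3
```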
OK, let's see if this helps you. Suppose you have two functions $f,g:[a,b]\to \mathbb{R}$. If someone asks you what the distance between $f(x)$ and $g(x)$ is, that's easy: you would say $|f(x)-g(x)|$. But if I ask what the distance between $f$ and $g$ is, the question is not well-posed as stated. I can, however, ask: what is the distance between $f$ and $g$ on average? Then it is $$ \dfrac{1}{b-a}\int_a^b |f(x)-g(x)|dx=\dfrac{||f-g||_1}{b-a} $$ which gives the $L^1$-norm. But this is just one of many different ways you can do the averaging. Another way is related to the integral $$ \left[\int_a^b|f(x)-g(x)|^p dx\right]^{1/p}:=||f-g||_{p} $$ which is the $L^p$-norm in general.
Let us investigate the norm of $f(x)=x$ on $[0,1]$ for different $L^p$ norms. I suggest you draw the graphs of $x^{p}$ for a few values of $p$ to see how higher $p$ makes $x^{p}$ flatter near the origin, so the integral weights the vicinity of $x=1$ more and more as $p$ becomes bigger. $$ ||f||_p=\left[\int_0^1 x^{p}dx\right]^{1/p}=\frac{1}{(p+1)^{1/p}} $$ The $L^p$ norm is smaller than the $L^m$ norm if $m>p$ because the small values of $|f|$ are downplayed more heavily for the larger exponent. So depending on what you want to capture in your averaging, and how you want to define "the distance" between functions, you utilize different $L^p$ norms.
This also motivates why the $L^\infty$ norm is nothing but the essential supremum of $f$; i.e., you filter out everything except the highest values of $f(x)$ as you let $p\to \infty$.
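The closed form $\|f\|_p = (p+1)^{-1/p}$ for $f(x)=x$ can be cross-checked against a numerical integral, and watching it grow toward $1$ illustrates the $p\to\infty$ behavior. A sketch (the midpoint-rule quadrature is just one convenient choice):

```python
# L^p norm of f(x) = x on [0, 1]: closed form (p+1)^(-1/p),
# cross-checked against a midpoint Riemann sum; the value
# increases with p toward sup f = 1.
def lp_norm_of_identity(p, steps=100000):
    dx = 1.0 / steps
    integral = sum(((i + 0.5) * dx) ** p * dx for i in range(steps))
    return integral ** (1.0 / p)

prev = 0.0
for p in [1, 2, 4, 8, 16]:
    closed_form = (p + 1) ** (-1.0 / p)
    assert abs(lp_norm_of_identity(p) - closed_form) < 1e-3
    assert closed_form > prev          # monotonically increasing in p
    prev = closed_form
# As p -> infinity the norm approaches 1, the (essential) sup of f on [0, 1].
```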
There are several good answers here, one accepted. Nevertheless I'm surprised not to see the $L^2$ norm described as the infinite dimensional analogue of Euclidean distance.
In the plane, the length of the vector $(x,y)$ - that is, the distance between $(x,y)$ and the origin - is $\sqrt{x^2 + y^2}$. In $n$-space it's the square root of the sum of the squares of the components.
Now think of a function as a vector with infinitely many components (its value at each point in the domain) and replace summation by integration to get the $L^2$ norm of a function.
Finally, tack this onto the end of the last sentence of @levap's answer:
... the $L^2$ norm has the advantage that it comes from an inner product and so all the techniques from inner product spaces (orthogonal projections, etc) can be applied when we use the $L^2$ norm.
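The sum-to-integral analogy above can be made concrete by discretizing: sample $f$ on a fine grid, take the Euclidean norm of the resulting vector, and scale by $\sqrt{dx}$; as the grid refines this converges to the $L^2$ norm. A sketch (the choice $f(x)=\sin(\pi x)$, whose $L^2$ norm on $[0,1]$ is $\sqrt{1/2}$, is purely illustrative):

```python
import math

def discrete_l2(f, n):
    """Euclidean norm of the n-point midpoint sampling of f on [0, 1],
    scaled by sqrt(dx) so that it approximates the L^2 norm."""
    dx = 1.0 / n
    return math.sqrt(sum(f((i + 0.5) * dx) ** 2 * dx for i in range(n)))

f = lambda x: math.sin(math.pi * x)
exact = math.sqrt(0.5)            # integral of sin^2(pi x) over [0, 1] is 1/2
assert abs(discrete_l2(f, 100000) - exact) < 1e-6
```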
I think I understand your question - typically $\|A\|_2$ has two definitions $$ \|A\|_2 = \sqrt{\text{largest eigenvalue of } A^{\ast}A} $$ and $$ \|A\|^{\prime}_2 = \sup_{\|x\|_2 = 1}\|Ax\|_2 $$ Note that $B = A^{\ast}A$ is a symmetric, positive semidefinite matrix ($\langle Bx,x \rangle = \|Ax\|_2^2 \geq 0$), and hence it can be diagonalized and its eigenvalues are non-negative. Write them as $$ \lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n \geq 0 $$ and consider an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ such that $$ Bu_i = \lambda_i u_i $$ For any $x \in \mathbb{R}^n$, write $x = \sum \alpha_i u_i$; then $$ \|x\|_2 = 1 \Leftrightarrow \sum_{i=1}^n \alpha_i^2 = 1 $$ Now consider $$ \|Ax\|^2_2 = \langle Ax,Ax\rangle = \langle A^{\ast}Ax,x\rangle = \sum_{i=1}^n \lambda_i \alpha_i^2 \leq \lambda_1 \sum_{i=1}^n \alpha_i^2 = \lambda_1 $$ Hence $\|Ax\|^2_2 \leq \lambda_1$ whenever $\|x\|_2 = 1$, and so $\|A\|^{\prime}_2 \leq \sqrt{\lambda_1} = \|A\|_2$
The other inequality is obvious - can you see that? (Hint: take $x = u_1$, so that $\|Au_1\|_2^2 = \langle Bu_1, u_1\rangle = \lambda_1$.)
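The agreement of the two definitions can be checked by brute force in two dimensions: compute $\sqrt{\lambda_1}$ for $B = A^{\ast}A$ directly, then sweep unit vectors $x = (\cos t, \sin t)$ and record the largest $\|Ax\|_2$. A minimal sketch with an arbitrarily chosen $2\times 2$ matrix:

```python
import math

# Arbitrary illustrative 2x2 matrix.
A = [[1.0, 2.0],
     [0.0, 3.0]]

# B = A^T A, computed entrywise.
B = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

# Eigenvalues of the symmetric 2x2 matrix B via the characteristic polynomial.
tr = B[0][0] + B[1][1]
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
lam1 = (tr + math.sqrt(tr * tr - 4 * det)) / 2   # largest eigenvalue

# Brute-force sup of ||Ax|| over unit vectors x = (cos t, sin t).
best = 0.0
steps = 20000
for i in range(steps):
    t = 2 * math.pi * i / steps
    x = (math.cos(t), math.sin(t))
    Ax = (A[0][0] * x[0] + A[0][1] * x[1],
          A[1][0] * x[0] + A[1][1] * x[1])
    best = max(best, math.hypot(*Ax))

assert best <= math.sqrt(lam1) + 1e-9    # sup_{||x||=1} ||Ax|| <= sqrt(lambda_1)
assert math.sqrt(lam1) - best < 1e-6     # and the bound is (numerically) attained
```

The sup is attained at an eigenvector of $B$ for $\lambda_1$, which is why the sweep gets arbitrarily close to $\sqrt{\lambda_1}$.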
The operator norm of a matrix depends on the norm that you are putting on the vector space on which it acts. So for example if the matrices we consider are in $M_n(\mathbb{R})$ then they are acting on $\mathbb{R}^n$. Given a norm $\| \cdot \|_\alpha$ on $\mathbb{R}^n$ we can create an associated norm (operator norm associated to $\alpha$) by defining
$\|A\|_{(\alpha)}=\sup_{v\neq0}\frac{\|Av\|_\alpha}{\|v\|_\alpha}$.
In the case that we are using the usual euclidean 2-norm for $\mathbb{R}^n$ then the associated $\|\cdot\|_{(2)}$ is what you describe above.
However, we also know that $M_n(\mathbb{R})$ is a vector space isomorphic to $\mathbb{R}^{n^2}$ and so we can put a norm on it this way. These are the $\|A\|_q$ norms that you mention.
In particular $\|A\|_2=\left(\sum|a_{ij}|^2\right)^{\frac{1}{2}}\neq \|A\|_{(2)}$ in general.
Take for example $A=\left(\begin{array}{cc} 1 & 0 \\ 0 & 2 \end{array}\right)$
Then $\|A\|_{(2)}=2$ and $\|A\|_2=\sqrt{5}$
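For this diagonal example both norms are easy to compute by hand and to confirm in code: the operator $2$-norm of a diagonal matrix is its largest absolute diagonal entry, while the entrywise (Frobenius) $2$-norm is the root-sum-of-squares of all entries. A sketch:

```python
import math

A = [[1.0, 0.0],
     [0.0, 2.0]]

# Operator 2-norm of a diagonal matrix: largest |diagonal entry|.
operator_norm = max(abs(A[0][0]), abs(A[1][1]))

# Entrywise (Frobenius) 2-norm: root-sum-of-squares of all entries.
frobenius = math.sqrt(sum(a * a for row in A for a in row))

assert operator_norm == 2.0
assert abs(frobenius - math.sqrt(5)) < 1e-12
assert operator_norm != frobenius        # the two norms genuinely differ
```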
Finally, just like in any finite-dimensional space, we have $\|A\|_1\geq \|A\|_2$. Further, on a finite-dimensional space all norms give the same topology. This implies that for any two norms and any $A$ in the space there are constants $k_1, k_2>0$ such that $\|A\|_\alpha\geq k_1\|A\|_\beta\geq k_2\|A\|_\alpha$.
Thus there is some constant $k>0$ such that $\|A\|_2\geq k\|A\|_1$. In fact, since we are viewing $A$ as a vector in $\mathbb{R}^{n^2}$, Cauchy-Schwarz gives $\|A\|_1\leq \sqrt{n^2}\,\|A\|_2 = n\|A\|_2$, so $k=1/n$ works (and is sharp: the all-ones matrix attains equality).
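The constant can be checked numerically: treating an $n\times n$ matrix as a vector with $n^2$ entries, Cauchy-Schwarz gives $\|A\|_1 \leq n\|A\|_2$, with equality at the all-ones matrix. A spot-check sketch (the random test matrices are an arbitrary choice):

```python
import math
import random

def entrywise_norm_1(A):
    """Sum of absolute values of all entries."""
    return sum(abs(a) for row in A for a in row)

def entrywise_norm_2(A):
    """Root-sum-of-squares of all entries (Frobenius norm)."""
    return math.sqrt(sum(a * a for row in A for a in row))

# ||A||_1 <= n * ||A||_2 when A is viewed as a vector in R^(n^2).
random.seed(2)
n = 4
for _ in range(1000):
    A = [[random.uniform(-5, 5) for _ in range(n)] for _ in range(n)]
    assert entrywise_norm_2(A) >= entrywise_norm_1(A) / n - 1e-12

# The all-ones matrix attains equality: ||A||_1 = n^2, ||A||_2 = n.
ones = [[1.0] * n for _ in range(n)]
assert abs(entrywise_norm_2(ones) - entrywise_norm_1(ones) / n) < 1e-12
```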