-ip 192.168.33.12 consul://192.168.33.10:8500/galera
$ docker run -d --name node2 -h node2 erkules/galera:basic \
  --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1
Create an Nginx configuration file that acts as a load-balancer for these two applications. Assuming you decide to run the load-balancer on the Docker host with IP 192.168.33.11, the following example will work:
events {
    worker_connections 1024;
}

http {
    upstream galera {
        server 192.168.33.11:3306;
        server 192.168.33.12:3306;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://galera;
        }
    }
}
Next, start your Nginx container, binding port 80 of the container to port 80 of the host, and mount your configuration file inside the container. Give your container a name, as this will prove handy later:
$ docker run -d -p 80:80 -v /home/vagrant/nginx.conf:/etc/nginx/nginx.conf \
  --name galera nginx
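Before testing, you can ask Nginx to validate the mounted configuration file from inside the running container:

$ docker exec galera nginx -t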
Test that your load-balancing works. Then head back to Recipe 10.3 and use the same steps presented there. Use confd to automatically regenerate your Nginx configuration from a template when you add MySQL containers.
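As a rough sketch of what that confd setup could look like (the /galera key prefix, file names, and paths here are illustrative assumptions, not part of the recipe; they need to match the backend and keys you used in Recipe 10.3), you would pair a template resource with a template that rewrites the upstream block:

# /etc/confd/conf.d/galera.toml (hypothetical template resource)
[template]
src = "galera.conf.tmpl"
dest = "/etc/nginx/nginx.conf"
keys = ["/galera"]
reload_cmd = "nginx -s reload"

# /etc/confd/templates/galera.conf.tmpl (upstream block shown; in practice
# the template would contain the full nginx.conf)
upstream galera {
{{range getvs "/galera/*"}}
    server {{.}};
{{end}}
}

Each time a MySQL container is registered or removed under the /galera prefix, confd rewrites the configuration and reloads Nginx.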
10.7 DATA: Creating a Spark Cluster
Problem
You are looking for a data-processing engine that can work in parallel for fast computation and access to large datasets. You have settled on Apache Spark and would like to deploy it using containers.
Solution
Apache Spark is an extremely fast data-processing engine that works at large scale (for a large number of worker nodes) and that can also handle a large amount of data.
With Java, Scala, Python, and R interfaces, Spark is a great tool to program complex data-processing problems.
A Spark cluster can be deployed in Kubernetes, but with the development of Docker Network, the Kubernetes deployment scenario can be used almost as is. Indeed, Docker Network (see Recipe 3.14) builds isolated networks across multiple Docker hosts, manages simple name resolution, and exposes services. Hence, to deploy a Spark cluster, you are going to use a Docker network and then do the following:
• Start a Spark master by using the image available on the Google registry and used by the Kubernetes example.
• Start a set of Spark workers by using a slightly modified image from the Google registry.
The worker image uses a start-up script that hardcodes the Spark master port to 7077 instead of using an environment variable set by Kubernetes. The image is available on Docker Hub, and you can see the start-up script on GitHub.
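The commands that follow assume that all containers can reach one another and resolve the name spark-master. If your hosts are not already joined by such a network, a minimal sketch would be to create one and attach every Spark container to it; the network name spark-net and the extra --net flag are assumptions of this sketch and do not appear in the printed commands:

$ docker network create -d overlay spark-net

On a single host you can drop -d overlay and get a local bridge network instead; an overlay network additionally requires the key-value store setup described in Recipe 3.14. Either way, add --net spark-net to each docker run command below.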
Let's start a master, making sure that you define the hostname spark-master:
$ docker run -d -p 8080:8080 --name spark-master -h spark-master \
  gcr.io/google_containers/spark-master
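Before adding workers, a quick sanity check that the master process started is to look at its logs (the exact output depends on the image version):

$ docker logs spark-master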
Now let's create three Spark workers. You could create more, on any host that is on the same Docker network:
To avoid crashing your nodes and/or containers, limit the memory allocated to each Spark worker container. You do this with the -m option of docker run.
$ docker run -d -p 8081:8081 -m 256m --name worker-1 runseb/spark-worker
$ docker run -d -p 8082:8081 -m 256m --name worker-2 runseb/spark-worker
$ docker run -d -p 8083:8081 -m 256m --name worker-3 runseb/spark-worker
You might have noticed that you exposed port 8080 of the Spark master container on the host. This gives you access to the Spark master web interface. As soon as the Spark master container is running, you can access this UI. After the workers come online, you will see them appear in the dashboard, as shown in Figure 10-6.
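If you would rather check from the command line than the browser, the standalone master's web UI also serves a JSON summary of the cluster. Assuming the Spark version bundled in the image exposes that endpoint, the following lists the registered workers:

$ curl -s http://localhost:8080/json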