System design: Web monitoring tool

Today we have a very interesting interview question: how do you make sure a web application is healthy, and how do you monitor it? Sometimes you may be asked to design a monitoring tool for your web application. Let's see how to do this.

Why do we monitor a web application? However much load testing or performance testing you do, you still have to monitor the application in real time, for multiple reasons. Let's go through them one by one. You want to know how many users are visiting your pages and which page is visited the most. You may want to know how many users are logged into the system, or even the browser type, whether it is Internet Explorer, Google Chrome, or Safari. You may also be interested in the bandwidth of the system, for example how much data passes through each port. You also need to know about CPU spikes and disks running out of space: if the CPU spikes, your system can crash, and if the hard disk runs out of space, your system can crash too. So you have to monitor these things and make sure you get the appropriate alerts in time to make the right decisions.

We also need to set up the right alerting mechanism, so that if any issue occurs in the system, the production support team is alerted and can take the necessary action to keep the system running. There are many more reasons to monitor and alert on a system, and I have not covered all of them here.

Now let's see what ingredients are really needed for monitoring. First, you need the web server logs. If you have a Tomcat or WebLogic server, you will get two important logs: the access log and the error log. The error log has the error information, and the access log has information about the number of requests, the URLs, the session information, and the web pages or web services being called. We need that information to build the right monitoring system.
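
As a rough sketch of what can be pulled out of an access log, the snippet below counts page hits, status codes, and browser families. It assumes the server writes the common "combined" log layout; the regular expression, the function names, and the sample lines are all made up for illustration, and the browser detection is deliberately crude.

```python
import re
from collections import Counter

# Assumption: access log lines follow the "combined" layout. Adjust the
# pattern if your server is configured with a different format.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def browser_family(agent):
    """Very rough user-agent bucketing, for illustration only."""
    if not agent:
        return "unknown"
    if "MSIE" in agent or "Trident" in agent:
        return "Internet Explorer"
    # Chrome user agents also contain "Safari", so check Chrome first.
    if "Chrome" in agent:
        return "Chrome"
    if "Safari" in agent:
        return "Safari"
    return "other"


def summarize_access_log(lines):
    """Return (hits per page, hits per status code, hits per browser)."""
    pages, statuses, browsers = Counter(), Counter(), Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        pages[match["path"]] += 1
        statuses[match["status"]] += 1
        browsers[browser_family(match["agent"])] += 1
    return pages, statuses, browsers


# Hypothetical sample lines, not taken from a real server.
SAMPLE_LINES = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /home HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/118.0 Safari/537.36"',
    '10.0.0.2 - - [10/Oct/2023:13:55:37 +0000] "GET /home HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 (Macintosh) AppleWebKit/605.1.15 (KHTML, like Gecko) '
    'Version/16.0 Safari/605.1.15"',
    '10.0.0.3 - - [10/Oct/2023:13:55:38 +0000] "POST /login HTTP/1.1" 500 120 '
    '"-" "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko"',
]

if __name__ == "__main__":
    pages, statuses, browsers = summarize_access_log(SAMPLE_LINES)
    print(pages.most_common(1), dict(statuses), dict(browsers))
```

The same counters answer several of the questions raised earlier: the most visited page, the number of error responses, and the browser mix.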

You can also use beacons. A beacon is nothing but an image that is deployed on a separate server; you add a link to that image in all your web pages. Instead of monitoring the Tomcat or WebLogic logs, you can then get information from the server where the web beacon is deployed. That is one way to do it. It will not give you a lot of information, but if you want, you can still use it to get some information without going to the Tomcat or web server logs.

You will also have load balancer logs. The load balancer log provides a lot of information, such as which server each transaction goes to and how session stickiness is working. This information is also needed to set up a monitoring tool.

The next one is bandwidth. If you want to know about the bandwidth of the system, the protocol available is SNMP, the Simple Network Management Protocol. There are a lot of Java APIs available for it, and you can use the protocol to connect to your different servers. From each server you can get information such as CPU capacity, disk space, and bandwidth. It can be stored in a file or in a database, and later that information can be used for monitoring.
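
To make the "collect, then store" step concrete, here is a minimal sketch. Note that it does not speak SNMP: as a stand-in, it reads disk usage and load average from the local machine using only the Python standard library, and serializes each sample as one JSON line. The function names, the field names, and the `metrics.jsonl` file name are assumptions for illustration.

```python
import json
import os
import shutil
import time


def collect_metrics(path="."):
    """Collect a small metrics sample from the local machine.

    This is a local stand-in, not an SNMP query.
    """
    usage = shutil.disk_usage(path)
    record = {
        "timestamp": int(time.time()),
        "disk_used_pct": round(100 * usage.used / usage.total, 1),
    }
    # os.getloadavg exists only on Unix-like systems and may fail there too.
    if hasattr(os, "getloadavg"):
        try:
            record["load_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return record


def to_json_line(record):
    """Serialize one sample as a single line, ready to append to a file."""
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical output file; a database table would work the same way.
    with open("metrics.jsonl", "a", encoding="utf-8") as fh:
        fh.write(to_json_line(collect_metrics()) + "\n")
```

Appending one line per sample keeps the file easy to parse later, whether it ends up feeding a chart or an alert check.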

Most applications will have a MySQL database. If you want information about MySQL, you have the MySQL logs, and with the appropriate configuration you will get information such as which queries run fast and which run slow. You can also see how many databases are connected, how many slaves are configured, and so on. There are also common commands like SHOW STATUS that give you information about MySQL. This information can be written out to files, which you can then parse to get what you need to build an analytical tool or a monitoring system.
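
As an illustration of the parsing step, the sketch below reads SHOW GLOBAL STATUS output that has been saved as tab-separated text (the layout the `mysql` client produces in batch mode) and derives a few numbers from it. The function names and the sample values are invented; the status variable names (`Threads_connected`, `Slow_queries`, `Questions`, `Uptime`) are standard MySQL ones.

```python
def parse_show_status(text):
    """Turn tab-separated 'name<TAB>value' lines into a dictionary."""
    status = {}
    for line in text.splitlines():
        name, sep, value = line.partition("\t")
        if not sep or name == "Variable_name":
            continue  # skip the header row and malformed lines
        status[name] = int(value) if value.isdigit() else value
    return status


def summarize_status(status):
    """Pick out a few values that are useful on a monitoring dashboard."""
    uptime = status.get("Uptime", 0)
    questions = status.get("Questions", 0)
    return {
        "connections_open": status.get("Threads_connected", 0),
        "slow_queries": status.get("Slow_queries", 0),
        "queries_per_second": round(questions / uptime, 2) if uptime else 0.0,
    }


# Hypothetical sample output, not taken from a real server.
SAMPLE_STATUS = (
    "Variable_name\tValue\n"
    "Questions\t12000\n"
    "Slow_queries\t7\n"
    "Threads_connected\t12\n"
    "Uptime\t600\n"
)

if __name__ == "__main__":
    print(summarize_status(parse_show_status(SAMPLE_STATUS)))
```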

If you have memcached installed in your application, there is also the memcached stats command, which I showed on the screen. It can be used to get the memcached information, and that information can also be used to build the monitoring tool.
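
For example, the text returned by the memcached `stats` command is a series of `STAT name value` lines terminated by `END`. A minimal sketch of turning that into a cache hit ratio could look like this; the function names and the sample numbers are assumptions for illustration.

```python
def parse_memcached_stats(raw):
    """Parse 'STAT <name> <value>' lines into a dictionary of strings."""
    stats = {}
    for line in raw.splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_ratio(stats):
    """Fraction of GET requests served from the cache (0.0 if no traffic)."""
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


# Hypothetical sample response, not captured from a real server.
SAMPLE_STATS = (
    "STAT curr_connections 10\r\n"
    "STAT get_hits 900\r\n"
    "STAT get_misses 100\r\n"
    "END\r\n"
)

if __name__ == "__main__":
    print(hit_ratio(parse_memcached_stats(SAMPLE_STATS)))
```

A falling hit ratio is the kind of signal that is worth charting next to database load.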

Even with all this information, you also need the right alerting mechanism, so that if an issue occurs anywhere in your system, you can alert the appropriate production support or production engineering team to take action. Here I am using a tool called Nagios, which you can configure and integrate with your system so that it sends alerts based on your configuration. For example, if your CPU goes beyond a certain limit, you want to send an alert. You can configure all of this based on whatever you need.

Now let's go to the important topic: how to design or set up an alerting system on your own. If you have deployed your application in AWS, you have a direct tool like CloudWatch, where you can configure everything and get this information. But here the question is to design or set up a system on your own, without using any of the readily available cloud monitoring tools. Let's see how we do this.

If you look at the system, it is a very simple web application. You have clients on the left side, then a load balancer, and the load balancer in turn connects to multiple app servers, probably Tomcat or WebLogic servers. All of these connect back to the SQL database. This is a very simple system. Let's see how we set up a monitoring tool and an alerting mechanism for it.

In this system you will most probably have logs on all the Tomcat servers. If we want to use all the logs, it is very difficult to query each and every Tomcat server for its log information. Instead, what I am doing here is sending all the log messages to a queue, and the queue in turn connects to a different machine where all your log files are stored. After that, we can parse the log information collected on that machine and extract the monitoring information we want, such as a chart or a graph. You can get whatever we discussed earlier, like how many pages are viewed and how many errors are happening, by querying the logs on this log server.
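
The queue-based flow can be sketched in a few lines. This is only an in-process illustration: Python's `queue.Queue` and threads stand in for a real message queue and separate machines, and the server names and log lines are invented.

```python
import queue
import threading

SENTINEL = None  # tells the collector that no more messages will arrive


def app_server(name, lines, log_queue):
    """Stand-in for a Tomcat server pushing its log lines to the queue."""
    for line in lines:
        log_queue.put((name, line))


def log_collector(log_queue, store):
    """Stand-in for the log machine: drains the queue into one store."""
    while True:
        item = log_queue.get()
        if item is SENTINEL:
            break
        server, line = item
        store.setdefault(server, []).append(line)


def run_demo():
    log_queue = queue.Queue()
    store = {}
    collector = threading.Thread(target=log_collector, args=(log_queue, store))
    collector.start()
    producers = [
        threading.Thread(
            target=app_server,
            args=(f"tomcat-{i}", [f"GET /page{i} 200", f"GET /err{i} 500"], log_queue),
        )
        for i in (1, 2)
    ]
    for producer in producers:
        producer.start()
    for producer in producers:
        producer.join()
    log_queue.put(SENTINEL)  # all producers are done, stop the collector
    collector.join()
    return store


if __name__ == "__main__":
    collected = run_demo()
    errors = sum(line.endswith(" 500") for lines in collected.values() for line in lines)
    print(sorted(collected), "errors:", errors)
```

The point of the design is that the app servers only write to the queue, so queries for charts and error counts run against the single log store instead of every server.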

You will also be getting logs from the SQL server. We can fetch those logs too and pass them into a system like Ganglia, which helps parse the logs and provide analytical reports from them. Ganglia is out of scope here, but it is an excellent monitoring tool if you want to set things up on your own. Ganglia can also be connected directly to the different VMs where the Tomcat servers are running, and you can get more information, such as bandwidth, by connecting over the SNMP protocol. You can see in the picture how the connections are made between the different servers to collect the appropriate information.

Finally, we have to set up an alerting mechanism to send the appropriate alerts to the different production engineering groups. Here I am using a tool called Nagios. We can integrate it directly with Ganglia, and based on the configuration it can send the appropriate alert. For example, if the CPU is spiking beyond 70% capacity, you want to send an alert message, or if your hardware is going beyond 90% capacity, you want to send an alert message, so that the appropriate people can pitch in and have extra hardware added. This is what I am setting up in Nagios; you can configure it and have the appropriate alerting messages.
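
The threshold rule itself is simple enough to show in code. The sketch below is plain Python, not Nagios configuration: it uses the 70% and 90% limits mentioned above, treats the 90% figure as a disk limit purely for the sake of the example, and every name in it is hypothetical.

```python
# Limits taken from the example above; applying 90% to disk is an assumption.
THRESHOLDS = {"cpu_pct": 70.0, "disk_pct": 90.0}


def check_thresholds(server, metrics, thresholds=THRESHOLDS):
    """Return one alert message per metric that exceeds its limit."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"ALERT {server}: {name}={value} exceeds {limit}")
    return alerts


if __name__ == "__main__":
    # Hypothetical readings for one app server.
    for message in check_thresholds("app1", {"cpu_pct": 85.0, "disk_pct": 40.0}):
        print(message)
```

In a real setup the returned messages would be handed to whatever notifies the production engineering group, such as email or a pager.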

What we discussed above is a very simple system design for setting up your own alerting mechanism or your own monitoring tool. Thanks for watching, have a great day. Bye.
