Q1. To be able to build prediction models of botnet attacks, we had to decide on information requirements. (a) What process did you use for identifying predictor variables for modelling botnet traffic flows? (b) What additional measures can be taken in terms of information (or data) gathering to further improve the accuracy of the prediction models?
(a) I started with a literature survey, reading articles to understand the problem and the technical terms. To understand the cause-effect relationships, I explored each variable individually. Ricardo and Raihan did a good job of preparing the data flows and explaining the dataset and how it was derived from the Wireshark data. For the purpose of the analytical modelling, …
Each team was provided with credentials to access the VM. What happens if those credentials are compromised? What will the impact be, and what steps can we take to ensure that confidential information is not leaked? There should be more than a single level of security in this case; for example, the R code and data should be encrypted. The next step will be to interview the information owners. The information assets will be grouped according to business needs rather than technology requirements, because each asset may contain items that need multiple technology solutions to address the same business need. In the botnet project, the data could be held in two different assets, such as the virtual machine and the cloud-based server. However, this can lead to conflicts of ownership and control. I will define clear rules about the retention schedules of these assets and about how they operate at these different levels, such as the location of the asset, its owner, the volume of data to be stored, whether the data contains personal information such as SSN or medical history, and how is it