This is the first time we have put our monthly report online. We would love feedback on changes you would like to see.
John Holley, Head of ICTS.
ICTS Service Level Agreement Performance
Incident and Service Request Reporting
Gallup Strengths training completed for all Systems staff
ICTS Project Update
Ricoh Invoice Automation
ICTS has requested assistance from the vendor to enable the API for direct posting into TechnologyOne. TechnologyOne has agreed to help ICTS in mid-December with this work.
Further updates will be provided in the next report.
BI Platform Implementation
ICTS has submitted a business case requesting funding to implement a BI platform called Qlik Sense. Qlik Sense is the same platform used by the TEC and provides a very user-friendly, dynamic interface for modelling and displaying data.
Once approved, ICTS hopes to have this system implemented before the end of 2016.
Roll-out of Skype for Business to end-users is now complete apart from MIT Enterprise. The remaining tasks before switching off Mitel are replacing the fax capability, migrating analogue phones to AudioCode and replacing the Contact Centre.
Migration of WAN & Internet Services to 2degrees
All complete apart from Middlemore, which was delayed by asbestos issues and is now scheduled for 1st December.
MIT Enterprise Network Upgrade
Progressing according to project plan.
Network Management & Monitoring – Product selected and approved. SOW underway and equipment ordered.
2016 CAPEX Device replacement
The 2016 replacement plan has been produced from our Service Desk asset database, with the oldest equipment to be replaced first, and sent to Faculty and Service Centre Managers.
Some time has been spent updating the database and confirming locations of equipment.
We are now well into the rollout, with about 50% of the new equipment deployed. The focus has been on staff machines, as staff are not available in their offices from mid-December.
The plan is then to move onto the classroom machines in December.
Skype for Business Training
The training plan has been completed as scheduled. Unfortunately, attendance was lower than we expected and many staff seem to have missed the opportunity. There will be one last chance, as MIT Enterprise will be the last to train and cut over in January 2017.
Service Delivery will be revisiting work areas to remove the old Mitel phones and expects this to be completed by 16th December.
BA work completed and migration due in December
Backup & Archiving
Business case approved and equipment ordered
Replacement Contact Centre Software
The evaluation team has short-listed two products, and reference checking is due in early December.
- Altiris agent installed on all servers (replacing KACE)
- Upgraded Altiris to latest version on 11th November
- Upgraded network switch environment in C Block on 12th November
- MitMedia upgraded on 15th November
- HEAT upgraded on 16th November
- Continued with network switch replacement program throughout November.
- Critical database patching actioned on 19th November
- Continued with patching program throughout November, including Skype and Oracle environments.
- EBS Upgrade 4.23 applied 26th November
- Further upgrade of HEAT on 30th November
The summary of threats detected and prevented by MIT firewalls:
ICTS Service Level Agreement Performance
Enterprise Services Availability
All Enterprise Services were at 99.5% or higher availability for November.
Service availability is measured as total uptime (excluding planned outage windows) minus any unscheduled outages.
Availability for the services listed is measured from 7.30am to 6.00pm, 5 days a week, apart from Voyager, which is measured from 7.30am to 9.00pm, 7 days a week. Core Infrastructure Services (Compute and Network) are available 24 hours, 7 days a week.
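As a rough illustration of the measurement described above, the sketch below works out the minutes in a 7.30am to 6.00pm, Monday to Friday window for November 2016 and turns an outage total into an availability percentage. The function names and the 60-minute outage figure are hypothetical, and the sketch assumes no public holidays or planned outage windows fall inside the measured hours; it is not the method ICTS uses to produce the reported figures.

```python
from datetime import date, timedelta


def sla_minutes(year, month, start_min, end_min, weekdays_only):
    """Minutes inside the daily SLA window for one calendar month.

    start_min and end_min are minutes after midnight (7.30am = 450).
    Public holidays are NOT excluded (an assumption of this sketch).
    """
    day = date(year, month, 1)
    total = 0
    while day.month == month:
        # weekday() returns 0-4 for Monday-Friday, 5-6 for the weekend
        if not weekdays_only or day.weekday() < 5:
            total += end_min - start_min
        day += timedelta(days=1)
    return total


def availability_pct(window_minutes, planned_minutes, unscheduled_minutes):
    """Availability as a percentage of measured time.

    Planned outage windows are removed from the measured time first,
    then unscheduled outage minutes are subtracted from uptime.
    """
    measured = window_minutes - planned_minutes
    return 100.0 * (measured - unscheduled_minutes) / measured


# Standard window: 7.30am to 6.00pm, 5 days a week, November 2016
window = sla_minutes(2016, 11, 7 * 60 + 30, 18 * 60, weekdays_only=True)

# Hypothetical example: 60 minutes of unscheduled outage, no planned windows
pct = availability_pct(window, planned_minutes=0, unscheduled_minutes=60)
print(f"{window} SLA minutes, availability {pct:.2f}%")
```

For a service measured 7.30am to 9.00pm, 7 days a week, the same function could be called with `end_min=21 * 60` and `weekdays_only=False`.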
Major Infrastructure Outages/Incidents during November 2016
There was one partial service outage during SLA hours in November.
Issues during November:
- Issues with Mailbox servers 3 & 4 (two out of a total of six) on the 7th November would have caused access issues for some users.
- Network slowdown across the enterprise network on Tuesday 8th November. Identified two security cameras flooding the network. As a result, we had to apply an urgent software upgrade and recycle the firewalls to clear the bottleneck. It took approximately 5 hours to fully clear the traffic and return to normal throughput. See Appendix 1.
- Patching of Skype environments caused problems with response groups set-up on the 21st November. Applied further patches to fix issue.
- Access to Skype for some users was slow on the morning of the 23rd November. Resolved by 8.00am.
- Network issues between Manukau and Otara on the 24th November. Reason traced to DCI switches and was resolved.
- Numerous issues encountered on Monday 28th November following the EBS upgrade. Individual problems identified and resolved over the next two business days.
- Skype issues for some users on 30th November following incorrect changes in preparation for last user rollout. Changes corrected and impacted users operational within 2 hours.
*Antivirus Coverage information for November is not available at this stage*
Average size of Email boxes/storage
Incident and Service Request Reporting
Actual Incident Summary
Actual Service Request Summary
Number of Incidents by Faculty
Service Level Targets
ICTS MAJOR INCIDENT REPORT
Date: 8 November 2016
Owner: Rajinesh Shankar
|Start of Outage||9:15am|
|End of Outage||2:00pm|
|Total Outage||4h 45min|
|Services Impacted; Refer to ICTS Service Catalogue||Total or Partial Service Impact|
|1. Internet Services at Otara and Remote sites, except Manukau||Partial|
|2. Wireless Access at Otara and Remote sites, except Manukau||Partial|
|1.IPS engine on Fortigate Firewall at Otara|
|2.Cameras in MIT network (especially Maritime)|
|3.Identified Infected computer at Maritime|
|Description of Incident|
|The problem was noticed when Nagios identified that the CPU on the Trusted Fortigates was maxed out at 100%; CPU usage is normally between 10% and 30%. On inspection, the cause was found to be the very high number of sessions being processed by the Fortigates, around 225196. This was also abnormal, as even a busy day would normally see a maximum of only 4500-51000 sessions. With the Fortigates processing this many sessions at 100% CPU, hardly any connections in or out of the Fortigates were succeeding.
We had the following issues between 9:15am and 2:00pm:
– Intermittent internet access for Otara and Remote Sites, except Manukau
– Wireless access issues for Otara and Remote Sites, except Manukau
|Root Cause Analysis & Restoration Action|
|An intensive investigation was carried out from the Fortigates back into the network. On that day, three CCTV cameras with default login details at our Maritime site in Union House flooded the network with traffic. We secured them by changing their passwords and rebooted them to clear the malicious code running on them. One domain computer at Maritime was also identified and remotely taken off the network. This machine has been scanned and cleaned.
The high CPU cycles and session counts continued after the Fortigates were rebooted, so the South Campus Fortigate was promoted to master Fortigate. This did not help either. Further investigation on the Fortigates via the command line identified that the IPS Engine was causing the high CPU cycles. Killing the IPS Engine services did not help either. A call was logged with Fortigate Support and a remote session was established so that they could investigate. The outcome was to upgrade the IPS Engine, as it was one version behind, the newer version having been released a few months earlier. After the IPS Engine was upgraded on the firewalls (both the Trusted and Untrusted HA pairs), a reboot was done. Since then, the high CPU cycles and session counts have not recurred.
It was concluded that the flooding of the network by the cameras and the machine at Maritime caused the Fortigate IPS Engine to crash.
We also need to determine who is responsible for managing the firmware (software updates) on CCTV cameras, as they also need patching just as with a normal computer or server.
|Restoration; Permanent or Temporary|
If temporary, please list the actions to be undertaken to prevent recurrence
|1. Secure all of our other CCTV cameras||FM|
|2. Determine who is responsible for managing the firmware (software updates) on CCTV cameras||FM|