Cloud and Infra Management Lead Engineer
Kuala Lumpur
Job No. 12933863
Full-time - Hybrid
Job Description
Position Overview
We are seeking a highly skilled and experienced Infrastructure Management Technical Lead to join our support team at a leading insurance company. The ideal candidate will be responsible for designing, implementing, and managing Azure cloud infrastructure solutions to ensure optimal performance, availability, and reliability of our IT services.
Key Responsibilities
- Design and implement Azure cloud infrastructure solutions.
- Manage and optimize Azure resources to ensure high availability and performance.
- Monitor and maintain the health of the Azure environment.
- Lead and mentor a team of IT professionals, providing guidance and support in infrastructure management tasks.
- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.
- Develop and maintain documentation for infrastructure processes, procedures, and architecture.
- Stay up-to-date with the latest industry trends and best practices in cloud infrastructure management.
- Ensure compliance with security policies and regulatory requirements.
- Coordinate with vendors and service providers to manage external resources and support.
- Implement security best practices and ensure compliance with industry standards.
- Collaborate with development teams to support application deployment and integration.
- Automate infrastructure provisioning and management using tools like Terraform.
- Troubleshoot and resolve issues related to Azure infrastructure.
- Perform regular backups and disaster recovery planning.
- Stay current with the latest Azure technologies and best practices.
- Provide technical guidance and support to other team members.
- Develop and maintain documentation for Azure infrastructure and processes.
- Participate in on-call rotation for after-hours support.
- Conduct performance tuning and capacity planning.
- Implement and manage Azure networking components.
- Ensure cost optimization of Azure resources.
- Ensure the reliability, availability, and performance of the entire infrastructure stack including compute, storage, network, and cloud components.
- Perform root cause analysis for infrastructure-related incidents and implement corrective actions.
- Collaborate with Engineering teams to plan and execute system upgrades and maintenance.
- Design and implement disaster recovery plans and business continuity strategies.
- Implement best practices for monitoring, logging, and alerting across the infrastructure.
- Foster a culture of continuous improvement and operational excellence.
- Collaborate with L3 support team and other engineers to design and enhance the architecture of infrastructure systems, ensuring alignment with business needs and technology standards.
- Ensure that controls are applied and regularly reviewed, primarily against SOX, to maintain full compliance with all our policies and regulatory obligations.
- Maintain network and security rules.
- Set up Azure Key Vault (AKV), personal access tokens (PAT), and service principals.
- Reach out to and work with RT Cloud when needed.
- Monitor Azure resource performance and alerts.
- Execute and coordinate disaster recovery (DR).
- Work on minor defect fixes.
- Document implementation steps and processes, known errors, and "how-to" articles.
- Start and stop application services during maintenance windows on Linux and Windows servers.
- Update SSL certificates and licenses.
- Troubleshoot issues related to application configuration on Linux and Windows servers.
- Coordinate with the L3 team on defect fixes.
- Good knowledge of Infrastructure as Code.
Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Minimum of 5 years of experience in infrastructure management, with a focus on Azure cloud solutions.
- Strong knowledge of Azure services, architecture, and best practices.
- Proven experience in managing and optimizing cloud resources for performance and cost efficiency.
- Excellent troubleshooting and problem-solving skills.
- Strong leadership and team management abilities.
- Excellent communication and interpersonal skills.
- Relevant Azure certifications (e.g., Microsoft Certified: Azure Solutions Architect, AZ-104, AZ-303, AZ-304) are a plus.
- Strong understanding of Azure architecture and services.
- Experience with infrastructure as code tools like ARM templates and Terraform.
- Proficiency in scripting languages such as PowerShell or Python.
- Knowledge of networking concepts and Azure networking components.
- Experience with monitoring and logging tools like Azure Monitor and Log Analytics.
- Strong problem-solving and troubleshooting skills.
- Excellent communication and collaboration skills.
- Ability to work independently and as part of a team.
- Experience with DevOps practices and tools.
- Familiarity with security best practices and compliance standards.
- Experience with backup and disaster recovery solutions.
- Expertise in maintaining Azure cloud components such as Data Factory, Data Lake, Databricks, Monitor, Log Analytics, AKV, SQL Server, AAD, Self-Hosted IR, and Windows and Linux VMs.
- Linux and Windows Server administration.
- Knowledge of Power BI Desktop/Workspace and Power Automate.
- Expertise in working with Terraform, Azure DevOps, and Jenkins.
- Python scripting skills.
- Understanding of networking concepts such as IP segments and DNS.
- Knowledge of security certificates and encryption methods.
Working Conditions
- Environment: This role is primarily office-based, with the possibility of remote work depending on company policies and operational requirements.
- Hours: Standard working hours apply, with the potential for occasional after-hours support or on-call duties as needed.
- Tools and Equipment: Regular use of computers, financial software, ticketing systems, and standard office equipment is required.