You are tasked with ingesting a large dataset from a cloud storage bucket into your lakehouse using a notebook. Describe the steps you would take to set up this ingestion process, including considerations for data validation, transformation, and scheduling.
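
As a starting point, the validation and transformation steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the column names (`order_id`, `amount`, `created_at`), the date format, and the helper names `validate`, `transform`, and `ingest` are all hypothetical, and an in-memory CSV string stands in for a file read from the cloud storage bucket. In a real notebook the same logic would typically run on the lakehouse's own engine, and the notebook would be attached to whatever job scheduler the platform provides.

```python
import csv
import io
from datetime import datetime

# Hypothetical schema: these column names are assumptions for illustration only.
REQUIRED_COLUMNS = ("order_id", "amount", "created_at")
DATE_FORMAT = "%Y-%m-%d"  # assumed date format


def validate(row):
    """Return a list of problems found in one raw row (empty list means valid)."""
    problems = []
    for col in REQUIRED_COLUMNS:
        if not (row.get(col) or "").strip():
            problems.append(f"missing {col}")
    if problems:
        return problems  # skip type checks when required fields are absent
    try:
        float(row["amount"])
    except ValueError:
        problems.append("amount is not numeric")
    try:
        datetime.strptime(row["created_at"].strip(), DATE_FORMAT)
    except ValueError:
        problems.append("created_at is not YYYY-MM-DD")
    return problems


def transform(row):
    """Cast a validated row to typed values."""
    return {
        "order_id": row["order_id"].strip(),
        "amount": round(float(row["amount"]), 2),
        "created_at": datetime.strptime(row["created_at"].strip(), DATE_FORMAT).date(),
    }


def ingest(raw_text):
    """Split raw CSV text into transformed rows and rejected rows with reasons."""
    good, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_text)):
        problems = validate(row)
        if problems:
            rejected.append((row, problems))  # keep rejects for later inspection
        else:
            good.append(transform(row))
    return good, rejected


# Stand-in for a file downloaded from the bucket.
SAMPLE = (
    "order_id,amount,created_at\n"
    "A1,19.999,2024-01-05\n"
    "A2,abc,2024-01-06\n"
    ",5.00,2024-01-07\n"
)

good, rejected = ingest(SAMPLE)
print(f"accepted={len(good)} rejected={len(rejected)}")
```

Keeping rejected rows alongside the reason they failed, rather than silently dropping them, makes a scheduled run easier to audit: the accepted rows can be written to the target table while the rejects go to a separate quarantine location for review.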