Detailed Job Description:
- Objective: The primary objective is to develop a reusable tool tailored for prompt improvement and validation with a strong emphasis on objective performance assessment.
- Foundation: The project will capitalize on our existing prompt engineering tools and harness the potential of our well-established cloud infrastructure to ensure efficiency and scalability.
- Collaboration: The role of Senior Software Engineer will entail close collaboration with internal staff to align tool development with team goals and overarching objectives.
- Initiation Phase: This position is pivotal in launching the development process. The comprehensive project plan is still being finalised as development begins.
- Tool Development: As the lead developer, you will spearhead the creation of a robust and highly reusable tool designed to enhance and validate prompts effectively.
- Integration: Seamlessly integrate the tool with our existing cloud infrastructure, ensuring a harmonious and efficient workflow.
- Objective Assessment: Implement advanced features for the objective evaluation of prompts and the performance assessment of LLMs. Candidates should be familiar with the following evaluation methods:
- BLEU (Bilingual Evaluation Understudy): BLEU is a metric commonly used for machine translation tasks. It measures the n-gram overlap between the generated text and one or more human-written reference texts. A higher BLEU score indicates a closer match to the references.
- ROUGE (Recall-Oriented Understudy for Gisting Evaluation): ROUGE evaluates the quality of summaries and text generation by comparing the overlap between the generated text and reference text in terms of n-grams (word sequences). It is often used for summarisation tasks.
- Perplexity: Perplexity measures how well a language model predicts a given dataset. Lower perplexity values indicate better model performance in terms of predicting the dataset.
- F1-Score: F1-score is a metric used for tasks like text classification and named entity recognition. It is the harmonic mean of precision and recall, providing a single measure of model performance.
- Human Evaluation: In some cases, human evaluators are involved in assessing the quality of LLM outputs. They can rate the generated text based on factors like fluency, relevance, and overall quality.
- Documentation: Develop comprehensive documentation to serve as a valuable resource for both end-users and team members, ensuring smooth adoption and effective collaboration.
- Collaboration: Foster a collaborative environment by working closely with internal staff and cross-functional teams. Ensure that tool development aligns with project goals and objectives.
- Feedback Integration: Continuously improve and iterate on the tool’s development based on user feedback and evolving requirements.
- LLM Experience: A proven track record of working with Large Language Models (LLMs), particularly LLaMA or GPT, is essential to effectively understand their nuances and challenges.
- Instruction-Based Prompt Engineering: A deep understanding of instruction-based prompt engineering is crucial for the development of effective prompts.
- Evaluation of Language Model Outputs: Proficiency in assessing language model outputs and using relevant metrics for performance evaluation.
- Programming Skills: Proficiency in Python and Rust is necessary for developing a robust and efficient tool.
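As an illustration of the automatic metrics named above, here is a minimal Python sketch of n-gram overlap (the idea behind BLEU and ROUGE), F1-score, and perplexity. It is not our tooling: the function names are hypothetical, tokenisation is plain whitespace splitting, and the precision shown is only one component of BLEU (full BLEU combines several n-gram orders and adds a brevity penalty).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(candidate, reference, n):
    """Return (matched, candidate_total, reference_total) n-gram counts.

    Matches are clipped: an n-gram counts at most as often as it
    appears in the reference.
    """
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    matched = sum((cand & ref).values())
    return matched, sum(cand.values()), sum(ref.values())


def ngram_precision(candidate, reference, n=1):
    """BLEU-style clipped n-gram precision (no brevity penalty)."""
    matched, cand_total, _ = overlap(candidate, reference, n)
    return matched / cand_total if cand_total else 0.0


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: share of reference n-grams found in the candidate."""
    matched, _, ref_total = overlap(candidate, reference, n)
    return matched / ref_total if ref_total else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Computed as exp of the negative mean log-probability; lower is better.
    """
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


if __name__ == "__main__":
    candidate = "the cat sat on the mat".split()
    reference = "the cat is on the mat".split()

    p = ngram_precision(candidate, reference, n=1)
    r = rouge_n_recall(candidate, reference, n=1)
    print("unigram precision:", p)
    print("ROUGE-1 recall:", r)
    print("F1:", f1_score(p, r))
    print("bigram precision:", ngram_precision(candidate, reference, n=2))
    # A model that assigns probability 0.25 to each of four tokens
    print("perplexity:", perplexity([math.log(0.25)] * 4))
```

In the example, five of the six candidate unigrams also occur in the reference, so unigram precision and ROUGE-1 recall are both 5/6, and a model that assigns probability 0.25 to every token has a perplexity of about 4. In practice, established library implementations would normally be preferred over hand-written metrics.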
Nice to Have:
- Cloud Technology Familiarity: An understanding of cloud infrastructure is beneficial for streamlined integration and scalability.
- Data Annotation Experience: Prior experience with data annotation, especially for NLP tasks, can enhance the tool’s capabilities.
- NLP Knowledge: Familiarity with Natural Language Processing concepts and technologies provides valuable context for tool development.
- Team Collaboration: Strong teamwork and communication skills are essential for effective collaboration with internal staff and teams.
- Technical Skills: Familiarity with relevant tools and technologies used for data and prompt analysis adds depth to your capabilities.
Tech Stack and Libraries:
- Programming Languages: Proficiency in Python and Rust for tool development.
- Cloud Technologies: Familiarity with cloud platforms and services (e.g., AWS, Azure, Google Cloud) for efficient integration.
- NLP Libraries: Experience with NLP libraries and frameworks such as spaCy, NLTK, or Hugging Face Transformers.
- Database Systems: Familiarity with database systems (e.g., SQL, NoSQL) for data storage and retrieval.
- Version Control: Experience with version control systems (e.g., Git) for collaborative development.
To apply for this job, please visit in.linkedin.com.