# HISTORY - From zero to hero with Python

## General information

**Date**: 21-11-2023

**Time**: 18:00 - 21:00

**Location**: Online ([Zoom](https://epfl.zoom.us/j/63351787229?pwd=RmM5c2RzSzVrTmswb2ludHpJOUptQT09))

**Code of Conduct**: [The Carpentries code of conduct](https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html)

### Before

Introduction: who is who?

Introduction to BioNT

### Schedule for the workshop

| Day | Tutorial | Instructor |
| -------- | -------- | -------- |
| 1 - Tuesday | [Programming environment setup](http://swcarpentry.github.io/python-novice-gapminder/index.html) | Wolfgang |
| 2 - Wednesday | [Tabular data manipulation and visualisation](http://swcarpentry.github.io/python-novice-gapminder/07-reading-tabular.html) | Renato |
| 3 - Thursday | [Advanced programming](http://swcarpentry.github.io/python-novice-gapminder/13-conditionals.html) | Rabea |
| 4 - Friday | [Build a website using Jupyter Book](https://workshop-building-websites-with-gitlab-biont-eu-2e49af9c9c94c62.gitlab.io/)| Lisanna |

### How will the workshop be run?

Why a specific setup for this workshop?
- We want to welcome participants from different areas (e.g. SMEs, job seekers, academia)
- Importance of privacy
- We would like to create an interactive atmosphere

How will we do this?
- Zoom with panel view
    - Panelists: instructors & helpers
    - Only trainers will be visible
    - No personal data will be displayed
- This [HedgeDoc](https://biont.biobyte.de/fsB71mmfSGm_fY5t2gG-OA?both#%E2%9D%93-Do-you-need-help-Please-describe-your-issue) document in Markdown for interactions
    - Markdown: lightweight markup language
    - [Documentation](https://biont.biobyte.de/features#Edit)

### How to participate?
**Ask your questions, raise issues, interact with us in this document**

In addition, to help you navigate this document, we followed the structure of the tutorial and included:

- Each Hands-on section (✏️ - where you will have to work) of the tutorial, including a part to ask questions or post issues you might face

:::warning
✏️ Hands-on: Topic

##### ❓ Are you finished with this section? Add a '+' below
- Yes: +++
- Waiting for the job to be done:
- Need help:

##### Your questions
Q: Help, I can't find the document

A: The document is here: https://shorturl.at/abfuI :-)

##### Do you need help? Please describe your issue
-
:::

A helper will help you

- Question sections (❓ - where we ask you something) for answering

:::success
❓ We have some questions:
-
:::

Let's try now!

::: warning
##### ✏️ Hands-on: Set yourself up
- Access this HedgeDoc main document: https://shorturl.at/abfuI
- Answer the following questions

##### ❓ Are you on this HedgeDoc? (Add a + when done)
- Yes: +++++++++++++++++++++++
- No (please send an e-mail to one of the helpers)

##### ❓ Do you need help? Please describe your issue
-
:::

:::success
##### ❓ Have you ever used Markdown? (Add a +)
- Yes +++++++
- No ++++++++++++++++
- What is Markdown?+
:::

:::success
##### ❓ We plan to adjust the break times according to the workshop's progression. Does this arrangement work well for everyone?
- Yes +++++++++++++++++++++
- No

Those who answered no, could you clarify your needs? Do you need a specific break time?
:::

### Programming languages and why learn Python?

:::success
##### ❓ Have you ever used a programming language? (Add a +)
- Yes +++++++++++++++++++
- No+++
- I have used R but many years ago. So I considered No
:::

:::success
##### ❓ What programming languages do you know?
- R +++++++++++++
- Matlab ++++
- Fortran +++
- python ++++++
- sql +++++
- Linux, R+++
- html +++
- Haskell +
- Php +
:::

:::success
##### ❓ Do you have Anaconda installed? (Add a +)
- Yes ++++++++++++++++++++++++
- No + !!
help - me and another participant had a wrong/old HedgeDoc document and are really lost... I just finally got here...
:::

### Version control system

:::success
##### ❓ Do you have a GitLab account? (Add a +)
- Yes ++++++++++++++++++++++++
- No

You will need it for day 4
:::

:::success
##### ❓ Do you have Jupyter Book installed? (Add a +)
- Yes +++++++++++++++++++++++
- No +
    - why: stuck on "Retrieving notices: ...working... " message
        - If you are using macOS: conda took a very long time for some users (up to 30+ minutes)
    - see above, I just finally got here and am lost...
    - I started from the Anaconda Navigator to open the notebook and then I opened the directory where the data is
    - where is the tutorial
    - As I wrote, I missed the beginning of this and have no idea how you opened Jupyter... two of us were in a different document and could not follow along

You will need it for day 4
:::

## Day 1 - Tuesday

### Schedule

| Starting time| Duration | Content |
| -------- | -------- | ---------------- |
| 18:20 | 10 min | [Summary and Setup](http://swcarpentry.github.io/python-novice-gapminder/index.html)|
| 18:30 | 20 min |[Running and Quitting](http://swcarpentry.github.io/python-novice-gapminder/01-run-quit.html) |
| 18:50 | 25 min |[Variables and Assignment](http://swcarpentry.github.io/python-novice-gapminder/02-variables.html)|
| 19:15 | 25 min | [Data Types and Type Conversion](http://swcarpentry.github.io/python-novice-gapminder/03-types-conversion.html)|
| 19:40 | 15 min | Break |
| 19:55 | 30 min | [Built-in Functions and Help](http://swcarpentry.github.io/python-novice-gapminder/04-built-in.html) |
| 20:25 | 25 min | [Libraries](http://swcarpentry.github.io/python-novice-gapminder/06-libraries.html) |
| 20:50 | 10 min | Summary + Feedback|

:::success
##### ❓ Before we start: Is the screen clearly visible (add '+')
Please zoom in ++
:::

### Start your programming environment

We will learn Python using Jupyter notebooks as our programming environment.
::: warning
##### ✏️ Hands-on: Start up your JupyterLab instance and bring up the JupyterLab main page in your browser

JupyterLab and the possible ways to launch it are explained in the first two sections of the tutorial's ["Running and quitting" chapter](http://swcarpentry.github.io/python-novice-gapminder/01-run-quit.html). JupyterLab should be installed on your machine as part of Anaconda.
:::

:::success
##### ❓Can you see the JupyterLab main page in your browser? Add a '+' below
- Yes: ++++++++++++++++
- Need time:
- Need help:

##### ❓ Do you need help? Please describe your issue
- The link to this file wasn't clearly sent to us. I had this link https://biont.biobyte.de/pZFSmefoS66yKDp9U4BGHw?both but nobody was checking it. This means of communication isn't a good idea in my opinion.
    - We also post the relevant links in the Zoom chat. During the workshop also with anchors to the right sections. Were you able to receive the chat messages there?
    - we don't see the Zoom chat. Ok, now I see, I didn't see anything there before.
    - Ok, great. Zoom is sometimes a bit of a hassle :-)
    - Next time we should show how to access the chat in Zoom at the very beginning. Thanks for your feedback.
- This communication is not good, I think I give up, I am not following what the teacher is saying because I was dealing with this. Good luck improving the workshop though, the idea is great.
    - We are here to help you. If you like we can get you back on track. +
- same here, I never know where people even read the comments, I posted a request for help in 3 sections above...
    - Was your question about Jupyter Book?
- I missed the beginning
- It's not helping to show it for Windows... I have a Mac. Command line would have been better.
    - but if you installed Anaconda, you already have it. Just start the terminal and type `jupyter lab`
    - If jupyter-lab is not installed, let us know.
    - Is it working for you now?
    - In case anyone is still trying to launch a notebook, please: run the Anaconda Navigator, and from there select Python 3 Notebook
- Could you repeat which order is needed to launch Jupyter?
    - Got it!
    - Great :+1:
    - In case anyone is still trying to launch a notebook, please: run the Anaconda Navigator, and from there select Python 3 Notebook
    - +1
- Can I work directly on the command line, or is there another reason to do it this way? I have Anaconda and Jupyter Notebook installed, but it is more difficult than when I use Emacs.
    - This question is being answered live +
- I don't need technical help, but what does warning mean on the editing side of this document?
    - A warning should usually not be an obstacle to using the tool (it's not an error). To know exactly what it means, we would need to know what the warning says :)
    - If you mean the Markdown tag "warning", it just creates the yellow box in the compiled view on the right.
    - Thank you, that makes sense.
- Could you please repeat the JupyterLab launching again?
    - The instructor shows the steps. Was it helpful to get it started? +thank you!
:::

:::success
##### ❓Can you see the JupyterLab main page in your browser? Add a '+' below
- Yes: ++++++++++++++++++++++
- Need time: ++ (am trying to get an answer above...)
:::

::: warning
##### ✏️ Hands-on: Explore the elements of the main page, then launch a Jupyter Notebook

See the [JupyterLab interface](http://swcarpentry.github.io/python-novice-gapminder/01-run-quit.html#the-jupyterlab-interface) and the [Creating a Jupyter notebook](http://swcarpentry.github.io/python-novice-gapminder/01-run-quit.html#creating-a-jupyter-notebook) sections of the tutorial chapter
:::

::: success
##### ❓ Are you inside an empty Jupyter notebook? Add a '+' below
- Yes: ++++++++++
- Need time:
- Need help:

##### ❓ Do you need help? Please describe your issue
- Some Markdown, like #, is not interpreted.
    - Are you sure you are in a Markdown cell? And did you add a space after the #? Solved.
- To go to Markdown mode we need to press 'M'. To go back to code, which key do we have to press?
    - you can just use the drop-down menu to go back to code
    - or 'y' :) :)
- when he refers to "tutorial", which one does he mean?
- !!! WHICH CHAPTER IN WHICH TUTORIAL?
    - You don't need to follow the materials live, just know that there is a self-learning version of this course that you can also consult at a later stage
    - The link was posted in the Zoom chat.
- my $ is not creating an equation
    - Are you in a Markdown cell? And did you run it (by hitting the play symbol)? In any case don't worry too much about Markdown now, we will have time to explore it more during the workshop.
- The trainer is just going too fast. It's difficult to keep pace and understand at the same time. If we are not focusing on Markdown, then what's the point of spending time there?
    - Even though we are not focusing on Markdown, it is a major part of the IPython notebooks and can help you to structure your projects. The focus will be on coding, but you should be aware that the Markdown cells exist.
    - Also, we will see an application on Day 4. Today, what you really need to know is that there are different types of cells for different purposes (code and annotation).
:::

### First steps with Python

#### Variables and Assignment

This part corresponds to the contents of the [Variables and Assignment](http://swcarpentry.github.io/python-novice-gapminder/02-variables.html) chapter of the tutorial.

::: warning
##### ✏️ Hands-on: Work through the chapter until you're done with the section [Variables can be used in calculations](http://swcarpentry.github.io/python-novice-gapminder/02-variables.html#variables-can-be-used-in-calculations.).

##### ❓ Do you need help? Please describe your issue
- How can I run the program from the beginning? All cells
    - Do you mean running all cells one after another? Or only a single cell? Thx!
    - Press ctrl+a to select all cells
    - Then press shift+enter
- But the order in a Python script is important. You need to define things before you use them.
    - Yes, that's correct. The instructor demonstrated that you should keep an eye on running the cells in the correct order.
- when I program in R or in the terminal there is autocomplete, and I can also use the up and down arrows to repeat the same command.
    - Use the tab key for suggestions
    - this only completes
    - yes, works
    - and repeat the command?
    - pity!! I do it a lot to correct my mistakes :)
    - THANKS
- Is there a bathroom break scheduled? I don't want to miss stuff
    - Yes, according to [the schedule](https://biont.biobyte.de/fsB71mmfSGm_fY5t2gG-OA?both#Schedule) we have a break at 19:40
- Does spacing matter in Python?
    - Yes, in Python whitespace is interpreted. The indentation has an impact on the code.
    - Okay. So is `age=` different from `age =`?
    - No, in that case it does not matter; whitespace only matters when indenting blocks. Anyway, it is good coding practice to put a space on both sides of the equals sign. [PEP8](https://peps.python.org/pep-0008/) is a good resource with suggestions on how to write and structure your code. It's a style guide for Python.
    - That makes sense. Thank you!
:::

:::success
##### ❓ Is the speed fine
- Yes: +++++++
- Too slow:
- Too fast: ++
:::

::: success
##### ✏️ Explore arithmetic operations in Python

1. Define two variables `a` and `b` and assign them the integer values `5` and `3`. Instead of `a` and `b` you can, of course, use any other valid variable name :-)
2. Print the result of adding the two values.
3. Print the result of subtracting the two values.
4. The operators for multiplication and division are `*` and `/`, respectively. Use them with your variables.
5. Compare the result of `(a + b) * b` and `a + b * b`.
6. Also use the additional operators `**`, `//` and `%`. What are they doing?

`**` does exponentiation. What about `//`?

**💡 Hint**: the last two operators might be hard to understand.
Try running: `help(divmod)` to display the help for another built-in function `divmod`, which combines the functionality of both `//` and `%`.

##### ❓ Do you think you have the answers to at least some of the above questions? Add a '+' below
- Yes: ++++++++++++++++++++
- Need time:
- Need help:

##### ❓ Do you need help? Please describe your issue
- should I always run things in different cells?
    - No, you can also use the same cell. Try to structure your code so that parts that belong together are in one cell; that way you are less likely to make mistakes by executing cells in the wrong order.
- What do `//` or `**` do?
    - That's a question in the exercise :) try to find out, and we will comment on it later
    - `//` is floor division.
    - `%` gives the remainder of the division (i.e. modulo)

##### ❓ What are your answers?
- (these results are consistent with `a = 2` and `b = 3`, not with the suggested `5` and `3`)
    - `a-b`: -1
    - `(a+b)*b`: 15
    - `a+b*b`: 11
    - `a**b`: 8
    - `a//b`: 0
    - `a%b`: 2
:::

:::success
##### ❓ Are you done?
- yes +++++++++++++++
- need time
:::

::: warning
##### ✏️ Hands-on: Continue with the remainder of this [chapter](http://swcarpentry.github.io/python-novice-gapminder/02-variables.html)
:::

#### Data Types and Type Conversion

This part corresponds to the same-named [chapter 3](http://swcarpentry.github.io/python-novice-gapminder/03-types-conversion.html).

::: warning
##### ✏️ Hands-on: Work through the chapter up to [Can mix integers and floats freely in operations](http://swcarpentry.github.io/python-novice-gapminder/03-types-conversion.html#can-mix-integers-and-floats-freely-in-operations.).

##### ❓ Do you have questions?
- where do we start?
    - The link is in this yellow box on top
    - so we read the entire chapter?
    - Right, that may be a bit confusing. You take this link: https://swcarpentry.github.io/python-novice-gapminder/03-types-conversion.html. Work through the chapter until "Can mix integers and floats freely in operations".
- how much time do we get?
- thanks
- I am not sure why we have to write "half is" and "three squared is"?
it also works if we just type in the numbers, can you please clarify why we wrote "half is" and "three squared is"? thanks
    - The print statement contains a message to improve the readability. The result of the calculation would be the same without these messages.
    - Lets you practice print :)
    - In addition, each notebook cell automatically prints the value of its *last* expression (line of code), but that's meant for debugging and obviously is not robust against changing the content of a cell.
- do we go to break now?
    - There will be a break after this part, roughly around 19:55
- I keep making bracket errors, is the Jupyter notebook not giving warnings about that? In RStudio the syntax is highlighted and I get a hint before executing..
    - It might be less obvious in the notebook, but if you put your cursor at a bracket, it will show you the matching closing bracket or turn red if it is missing.
    - ha, not in my version..... :(
    - yes, the green, I see it now.. VERY subtle..
- I don't understand why `print(str(1)+'2')` gives 12
    - because you concatenate two characters here, this is not a number but a character.
    - `str(1)` converts the integer to a string. Then the two strings get concatenated. The result is `'12'`, not `12`.
    - ah I see. tq!
- I don't fully understand the "separator" example.
    - Can you explain where in the example the problem is?
    - I just don't understand what's being done in Python.
    - It demonstrates that you can use the `*` operator with strings. When you take a string, say `'a'`, and use the `*` operator, the string is repeated n times. Example: `'a'*4` will return `'aaaa'`
    - So `'='*10` will create a separator bar when printed.
    - Ohh, I see! Thank you.
- My code is not running! When I type a line of code and press shift+enter, nothing happens and the line number turns into `*`. Any idea?
    - Please check if your kernel is still running. If not, please restart the kernel: in the menu bar select "Kernel" and then "Restart Kernel". Did that help?
    yes, thanks
    - The `*` means the code *is* running but hasn't finished. If it doesn't change, follow the steps above.
- so when do we use float? sorry, I didn't get it
    - Floats are needed if you have floating point numbers, so for example if you divide 1 by 2, you get 0.5. This cannot be stored in an integer. If you try `int(0.5)` (casting 0.5 to an integer), you will see the issue: the result is `0`.

##### ❓ Have you completed the mix integers and floats section?
- Yes:+++++++++++++++++++
- More time please:
-
- did not see the box, was in the swcarpentry doc!!
    - We added it just before announcing it, no worries
:::

::: success
##### ✏️ Don't use floats just because you can!

While it is technically true that you can mix integers and floats freely, don't do this when it is not required!
The `int` type in Python is a very sophisticated type, highly optimized for the task of storing integer numbers of *any* size. Try it yourself:

```
a = 3*10**16
b = a + 1
print(a, b)
```

Nothing special to see here, you think? Well, try with floats:

```
a = float(3*10**16)
b = a + 1
print(a, b)
```

❓ Do you have an explanation for this behavior?
- Yes: ++++
- I think I got it: +++
- I am lost: +

`a` is assigned the result of the expression `3 * 10**16`. The result is an integer. In the second example, I use `float`, so `a` is a float and the result will be a float too.

---

**Conclusion**: There are good reasons for using floats in programs, but don't give up the safety of Python integer arithmetic if you don't have to. A float stores only about 15 to 17 significant decimal digits, so at this magnitude `a + 1` cannot be represented and is rounded back to `a`, while the integer version stays exact.
:::

<!-- #### Add this box after each break -->

Let's come back at 20:15 (CET)

:::success
##### ❓ Are you back?
- Yes ++++++++++++++
- No

##### ❓ Any questions regarding what we did until now?
- Q1:
    - A:

---

- Q2:
    - A:

##### ❓ Is the speed fine
- Yes:++++++++++++
- Too slow:
- Too fast: now yes, it was really hard to navigate HedgeDoc, Zoom, Jupyter, tutorial etc., too fast at the beginning +++
:::

#### Built-in Functions and Help

This part corresponds to the same-named [chapter 4](http://swcarpentry.github.io/python-novice-gapminder/04-built-in.html).

::: warning
##### ✏️ Hands-on: Work through the chapter up to and including the section [Functions attached to objects are called methods](http://swcarpentry.github.io/python-novice-gapminder/04-built-in.html#functions-attached-to-objects-are-called-methods).

##### ❓ Do you have questions?
- From where to where shall we work and how long do we have?
    - Work through chapter 4 up to and including the section [Functions attached to objects are called methods](http://swcarpentry.github.io/python-novice-gapminder/04-built-in.html#functions-attached-to-objects-are-called-methods)
- *Please explain more about the None. I thought I would get "result of print is example".*
    - `None` is a special value (its type is `NoneType`) that represents the absence of a value. `print()` only displays its argument and has nothing to return, so it returns `None`. It is loosely similar to a `void` return in C or `null` in Java.
- Would the `None` in Python be equivalent to the `NA` in R?
    - I'm not using R, but I suspect NA means a missing value, which is somewhat different from the None type.
    - In R, `NULL` is the equivalent of `None`
- why isn't the result of `result = print('example')` shown as 'example' in `print('result of print is', result)`?
    - ok, answered just now and above
- Why is 0 considered the minimum among 0, a and A?
    - This is about the minimum of the strings "0", "a" and "A" (the string a, not the variable). Strings are compared by the [ASCII (encoding)](https://en.wikipedia.org/wiki/ASCII) number of each character, which is in fact a number: "0" is 48, "A" is 65 and "a" is 97, so "0" is the smallest. :+1: +
- Can I change EVERY function into an internal Python operation? The example shows `len()` and `.__len__()`
    - These functions you refer to are named dunder (double-underscore) or magic methods.
    You can use these magic methods in your own objects so that they behave like built-in objects. That means you can use `len(obj)` when your custom object implements `__len__()`
- When I apply a method to an object, is the result saved automatically into the object?
    - No. The method could modify the state of the object, but it does not necessarily have to. String methods, for example, return a new string and leave the original unchanged, whereas `append` on a list modifies the list itself. You can also have something like a static method (this is what the concept is called in Java, for example). Static methods always behave the same and do not consider or touch the state of an object.
- how would you define 'internal python operations' and why are these methods marked with double underscores?
    - These methods are magic methods. Marking them with double underscores is a convention. Think about the built-in `len`. Using `len` on an object is actually calling `obj.__len__()` behind the scenes. The data model of Python was designed that way so that built-in functions can be supported by every object. See the Python Data Model for more details: https://docs.python.org/3/reference/datamodel.html#

#### ❓ Are you done?
- done: +++++++++++
- more time
:::

::: success
##### ✏️ Bonus info and task

One of the advantages of methods is that they provide a natural way of bundling an object with actions you may want to perform with it. This makes it easy to discover functionality. If, for example, you're wondering if there is a built-in way to strip leading/trailing whitespace from a string, you can just look through the methods of the string.

The built-in function `dir()` lists all methods and attributes of an object. It works with a concrete object, but also with types, so if `user_input = "Yes "` both of these can be used to look for a suitable method:

1. `dir(user_input)`
2. `dir(str)`

❓ Can you find and use a method that will turn the *user_inputs* `" Yes"`, `"Yes "`, and many more variations into just `"Yes"`?
- Yes, found it:
- A bit more time please:
- Cannot find it:+++
:::

::: warning
##### ✏️ Hands-on: Continue with the remainder of this chapter

Go up to 'Python reports a runtime error when something goes wrong while a program is executing'

#### ❓ Do you have a question?
- I am not sure help() is sufficient at the beginning, is there a more in-depth help with examples? e.g. as you find in R with ? and ??
    - An alternative (in Jupyter/IPython) is to use the name of the function followed by a `?`. For example `print?`. But in most cases I think it is not more useful than help.
    - https://docs.python.org/3/
- I ran dir() and there is a list of functions, not sure how I will use them though?
    - Answered by the instructor
:::

#### Libraries

This part corresponds to the same-named [chapter 6](http://swcarpentry.github.io/python-novice-gapminder/06-libraries.html).

::: warning
##### ✏️ Hands-on: Work through all of the chapter

##### ❓ Do you have questions?
- is that a hands-on box? ![](https://biont.biobyte.de/uploads/980871b7-6030-4cdb-9420-10934f4c690e.png) --> it is a picture, see view mode
    - These boxes contain exercises to test your knowledge.

#### The bonus task:

You want to select a random character from a string:

```
bases = 'ACTTGCTTGAC'
```

1. Which standard library module could help you?
2. Which function would you select from that module? Are there alternatives?
3. Try to write a program that uses the function.

-
- 1- Random (yes, but case matters: the module is called `random`!)

Step by step for when I don't know which library to use in Python. For example, I would like to see which Python libraries are available for generating random numbers.

Go to https://docs.python.org/3/library/

Enter "random" in the search box.

Various options appear, usually including modules such as `random` or `secrets`.
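The candidate module can also be explored from inside the notebook with the `dir()` built-in introduced earlier. This is a minimal sketch using only the standard library; the filter on the substring `choice` and the variable name `candidates` are illustrative assumptions, not part of the tutorial:

```python
import random

# dir() lists every attribute name of the module; names starting
# with "_" are internal, so they are filtered out here.
candidates = [name for name in dir(random)
              if "choice" in name and not name.startswith("_")]
print(candidates)

# The docstring gives a one-line summary of what a candidate does.
print(random.choice.__doc__)
```

Any name printed this way can then be tried on the `bases` string from the task.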
A first example using `random`:

```
import random

bases = 'ACTTGCTTGAC'
print(random.choice(bases))
```

The same example as before, but using `secrets`:

```
import secrets

bases = 'ACTTGCTTGAC'
print(secrets.choice(bases))
```

A third example using another library, `numpy`:

```
import numpy as np

bases = 'ACTTGCTTGAC'
print(np.random.choice(list(bases)))
```

- how do you even begin to know which library to use?
    - In this case you would know that you want to select a 'random' character, so you would look for a library that gives you functionality to deal with randomization.
    - It always depends on the specific use case you have.
    - That makes sense. Thank you!
- I have no idea how to even start... waiting for the instructor to walk us through
    - First starting point: import the random module. Then look around to see if there is any function that suits your needs.
- any chance to show us around the docs.python site?
    - https://docs.python.org/3/library/random.html
    - Look for the functions for sequences
- What is the difference between `from random import choice` and `import random`?
    - If you use `import random` you import the whole module, and its functions are reached through the module name, so you would need to write `random.choice()`. If you instead use `from random import choice` you import only the specific `choice` function and can call it directly as `choice()`. :+1:
:::

### Feedback

:::success
##### ❓ One thing that was good about today
- having access to the resources where I can go over the exercises again (sw carpentries, know my way around there now..) ++++++
- Being shown different ways to arrive at the same desired outcome.
- I think it was a good start :) thanks for your efforts ++++++
- good explanations and access to tutorials and possibility to practice +++
- Being able to go through the tutorial under the guidance of the speaker and the staff helping in the HedgeDoc document.
+++++
- it's good that we have the exercises and the materials+++

##### ❓ One thing to improve
- well, we've been there, the mixing of 4-5 tabs/apps/docs was difficult +++++++
- mentioning beforehand that it would be better to use two screens for the workshop
    - Thank you for your feedback. We got the same request during the first workshop, so this time we added to the registration page under the recommendations: "To follow the workshop more efficiently, we recommend having a two-screen setup"
    - it is easier for me after the previous BioNT workshop, you will get there too
- Add an examples section in this document, where we can put our examples directly.
- a bit slower start with launching JupyterLab (especially using the command line, as the data files need to be in exactly the same path as the instructor's; I got lost a bit as I had them in the downloads)
- it would be nice to learn how to run it through the terminal++
    - Thank you for your feedback, we can show how to do that in the next days
- Communication/links to documents were hard to navigate in the beginning. But I've got the hang of it now. It would be so much easier if this was communicated beforehand and the links provided the day before the workshop.
- It was really fast in the beginning so really difficult to follow, and also definitely need at least 2 screens

##### ❓ Any other comments?
- I would love to have a quick intro to the help site as I wrote above.. I know that these sites are essential
- good explanations, speed was too fast at the beginning, but time for the exercises was good enough
- are we going to have, at some point, applications of Python to real-life problems? +
    - Which applications would you like to see?
- Will there be a video available of today's and the other days' explanations for the participants?
    - Yes, we will send the video as soon as possible (but after the workshop)
    - Perfect. Thank you! :+1:

enjoy the evening and rest! CU tomorrow.
Thanks Wolfgang and see you tomorrow
:::

## Day 2 - Wednesday

### Schedule

| Starting time | Duration | Content |
| ------------- | -------- | ---------------- |
| 18:00 | 10 min | Welcome + Summary |
| 18:10 | 25 min |[Reading Tabular Data into DataFrames](http://swcarpentry.github.io/python-novice-gapminder/07-reading-tabular.html) |
| 18:35 | 35 min | [Pandas DataFrames](http://swcarpentry.github.io/python-novice-gapminder/08-data-frames.html)|
| 19:10 | 35 min | [Plotting](http://swcarpentry.github.io/python-novice-gapminder/09-plotting.html) |
| 19:45 | 25 min | [Lists](http://swcarpentry.github.io/python-novice-gapminder/11-lists.html) |
| 20:10 | 15 min | Break |
| 20:25 | 30 min | [For Loops](http://swcarpentry.github.io/python-novice-gapminder/12-for-loops.html) |
| 20:55 | 5 min | Summary + Feedback |

:::warning
##### ❓ Do you have any questions about yesterday's content?

You can see all questions and answers from yesterday [here](https://biont.biobyte.de/s/TfzaOoBp0#)

Q: THIS MARKDOWN FILE IS EMPTY!! I was in the middle of going through all of yesterday's exercises again, how can this be?? Until recently this document had all the info on yesterday!

A: Please refer to the history [here](https://biont.biobyte.de/s/TfzaOoBp0#). There you can find yesterday's content.
- it's really complicated to follow all these documents at all.. I also teach with Markdown, no problem to have a tab for previous days.. why the hassle?
    - Because we need to design a system that works with a huge number of learners at the same time, and this document can grow a lot during the lesson (see yesterday). A very similar solution is implemented by another training entity with similar needs (managing big numbers of learners), i.e. CodeRefinery. The history is linked at the top of this yellow box, do you see it?
Q:

A:

Q:

A:

Q:

A:
:::

:::success
##### ❓ Before we start: Is the screen clearly visible (add '+')
- Yes ++++++++++++++++++ +
- No
- Please zoom in

---

I came two minutes late and am already lost again. WHERE ARE WE!!!!!!!!!!!!!!!!!! YOU KEEP SAYING ADD PLUS, but WHERE!!!???
- We are just starting. You should open a notebook and follow the instructor to name and save your file. Do you need help here? Add a + above if you are ready.
:::

### Reading Tabular Data into DataFrames

:::success
##### ❓ Jumping in: Were you able to import pandas and load the data? (add `+`)
- Yes ++++++++++++
- No (please copy-paste any error message you received)

---

- error: `ModuleNotFoundError` Traceback (most recent call last), `Cell In[1], line 1`: `----> 1 import pandas as pd`, `ModuleNotFoundError: No module named 'pandas'`. Mambaforge
    - use `conda install pandas` in your prompt or terminal
    - but my terminal is running and I di
    - ok, shall I just give up then? He said that it's impossible to follow along
    - No, please try to install it and we will try to get you up to speed
    - H: I did install it and in my notebook used import pandas as... no error. Does it mean success? Any way of TESTING if it was successful?
    - You will test it very soon. But yes, it means it worked
    - thank you
    - And now my unzipped data is not in the same folder. Shall I move it?
    - not necessarily, but you will need to know where the files are. If they are not in the data subfolder, your path will look different from Renato's
- can I know how to unzip the files in Jupyter?
    - while Jupyter is active or shall I stop my notebook?
    - you don't need to close the notebook. Just unzip the files in the folder it is in.
- how did you view the data like this?
    - Can you explain how you would like to see the data?
    - I don't see it in a table like you
    - you mean inside your notebook on the right side?
    - I don't have the left panel
    - please go to the very left side of the notebook; at the top you see a folder symbol. Please click there.
It the folder should show up. - I dont see this, so my Jupyter notebook have a different views so I have to open the notebook in a completely new tab - are we supposed to WATCH or DO as he speaks??? - Please try to listen and follow along if this is to fast we will ask Renato to repeat - iT IS DEF. ***too fast.*** - you still speak fast - hope the speed is fine once you get working along. - error: SyntaxError: invalid syntax (I had to use a differnt path as data is elsewhere.. ) - Can you share what your coment was? You would need to specify the location of your data. 'data_oceania = pd.read_csv(/Users/NAME/PY_Course_2023/gapminder_gdp_oceania.csv)'' - try `data_oceania = pd.read_csv('/Users/NAME/PY_Course_2023/gapminder_gdp_oceania.csv')` - yes thank you - forgot the quotes. Its a bit too fast for me, so I get hectic... - all good me too some times as long as you catch up :-) - i still need time because my directory in noregognised so i need to fix it - that's ok you can also skip the next comment or I catch you up onec you are ready - for some reason I cant set a directory, its still doesnt recognise it! Is there something specific that I should be careful with? - module 'pandas' has no attribute 'read_csv' - could you share your comment? 
- It gives me this error:
  AttributeError Traceback (most recent call last) Cell In[5], line 1 ----> 1 data_oceania = pd.read_csv("Downloads/Ciencia/python-novice-gapminder-data/data/gapminder_all.csv") AttributeError: module 'pandas' has no attribute 'read_csv'
- ***please repeat the code to load only a country***
- I have repeated it and I get the same error: AttributeError: module 'pandas' has no attribute 'read_csv'. I have imported it, but I don't know if it has been imported correctly.
    - Did you import the module correctly first, with `import pandas as pd`?
    - I think I have done it, but I am not sure if it worked correctly.
    - Please run the import again; if it works, the `read_csv` line should work as well.
    - ok, thanks
- Yesterday we learned to use integers, NOT floats... what is different now? An integer is 1 and a float is 1.0 (as an example).
    - It depends on your task. You need floats if you want to compute with digits after the decimal point. If not, you can use integers.
- How many rows does Python check to set the data type? The whole file?
    - Yes, if all values in a column are consistent it will assign the right type.
    - Also if the file is very big?
    - Yes, reading long files is not really a big deal for Python.
    - And if not? :-)
    - If the values are not consistent, pandas falls back to the more generic `object` type (the one it also uses for strings), I believe.
- Can we occasionally get a minute to type what he types?
    - Do you find the speed too fast to follow?
    - Not dramatically. I just listen first because it's very interesting and information-rich, and then kind of miss the time to also read all the comments in hackMD AND also type it all in...
    - I see. If you miss one section you can always go back to this document and the [material](http://swcarpentry.github.io/python-novice-gapminder/instructor/07-reading-tabular.html)
- Which other types of data (formats) can we read with pandas?
    - There are many, like CSV, JSON, Excel, Parquet etc.
- why am I getting this?
  NameError Traceback (most recent call last) Cell In[6], line 1 ----> 1 print(data_oceania_country.describe()) NameError: name 'data_oceania_country' is not defined
    - You should first read the file, like this: `data_oceania_country = pd.read_csv('gapminder_gdp_oceania.csv', index_col='country')` followed by `print(data_oceania_country)`
    - and then try this: `print(data_oceania_country.describe())`
- Where are the exercises?
    - Check the boxes below.

**Our code to share**

You can set up the path in a variable, for example `data = "Develop/from0toHeroPython/data/gapminder_gdp_oceania.csv"`, and then use this variable in `pd.read_csv(data)`.

- Please share the code, I have to run this: `print(data_oceania).info()`
    - Did you want to print the output of `.info()`? Then do this: `print(data_oceania.info())`. (`info()` already prints its summary itself, so `data_oceania.info()` alone is enough.) If not, please describe your issue with this line.
    - In the notebook you can also type only the expression without `print`, for example: `oceania.columns`
    - okay, thanks, I got it :+1:
- Sorry, I think I am lost... where can I get the microbes.csv file?
    - You don't need to download it. Just try to answer in the pad.
    - ohhhh okie okie thanks :) :+1:
:::

:::success
##### ❓ Inspecting our data: How many years do we have data for? Add a number after a `-` or a `+` after the number if it exists
- 12 +++++++
- 12, but I counted from the print... I am sure there is a way to ask how many columns there are??
    - try this: `num_rows, num_columns = data_oceania_country.shape`
    - I do not know how to write this as code..
    - `print(data_oceania_country.shape[1])`
    - but that is without the `num_rows` command...?
    - It's not the best way, but... `print(data_oceania_country.T.count())`
:::

::: warning
##### ✏️ Hands-on: Reading Files in Other Directories
The data for your current project is stored in a file called `microbes.csv`, which is located in a folder called `field_data`.
You are doing analysis in a notebook called `analysis.ipynb` in a sibling folder called `thesis`:
```
your_home_directory
+-- field_data/
|   +-- microbes.csv
+-- thesis/
    +-- analysis.ipynb
```
What value(s) should you pass to `read_csv` to read `microbes.csv` in `analysis.ipynb`?
- `../field_data/microbes.csv` +++++
- 'your_home_directory/field_data/microbes.csv'
- "your_home_directory/field_data/microbes.csv" +
- “your_home_directory/field_data/microbes.csv”
- Complete path to the file

---

- I only have gapminder documents... where is microbes??
    - You don't have the file; it is just the example for this exercise. It's a hypothetical situation.
    - Gosh, I was searching and searching for the downloaded material...
    - Sorry for that, but please listen and follow. THANK YOU
- Still have a question: it doesn't look like pandas is working for me: ==> WARNING: A newer version of conda exists. <== current version: 23.7.4 latest version: 23.10.0 Please update conda by running $ conda update -n base -c defaults conda Or to minimize the number of packages updated during conda update use conda install conda=23.10.0
    - This is not an error, it's a warning. It should not prevent you from using conda (the older version you have).
    - So now I can't continue at all
    - What is your current conda version?
    - 23.7.4
- I think it's working, but now I can't run `data_oceania = pd.read_csv('data/gapminder_gdp_oceania.csv')` because of the error "No such file or directory: 'data/gapminder_gdp_oceania.csv'", while the code `data_oceania_country = pd.read_csv('gapminder_gdp_oceania.csv', index_col='country')` followed by `print(data_oceania_country)` is running.
    - OK, let's analyse the difference between the two lines of code. In the first one, you try to use `data/gapminder_gdp_oceania.csv`. In the second, `gapminder_gdp_oceania.csv`. If the second line works, I suspect that your data is not in the `data` subfolder but in the same folder as your code. `data_oceania = pd.read_csv('gapminder_gdp_oceania.csv')` should work.
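
A minimal sketch of the path question above, for anyone who wants to check where their notebook is looking. The helper name `find_data_file` is hypothetical (it is not part of pandas or the lesson); the file name is the one used in the lesson, and the two folders tried are the two setups discussed in this thread.

```python
from pathlib import Path


def find_data_file(filename, folders=(".", "data")):
    """Return the first existing path to `filename` in `folders`, or None.

    Hypothetical helper: it only tries the two layouts discussed above
    (file next to the notebook, or inside a `data` subfolder).
    """
    for folder in folders:
        candidate = Path(folder) / filename
        if candidate.exists():
            return candidate
    return None


# Relative paths are resolved from the current working directory.
print("Working directory:", Path.cwd())

path = find_data_file("gapminder_gdp_oceania.csv")
if path is None:
    print("File not found next to the notebook or in ./data")
else:
    print("Pass this to pd.read_csv:", path)
```

Whatever path it prints is the string to give to `pd.read_csv`; if it finds nothing, the file is somewhere else and you need a longer (or absolute) path.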
    - Yes, now it's working, but I don't understand how I actually made that happen. E.g. in R you set your directory; am I doing the same here?
    - Not really. In a Jupyter notebook the "working directory" is by default the folder that contains the notebook, and relative paths are resolved from there. (It can be changed, for example with `os.chdir`, but we do not need that here.) So, if your notebook is in the same folder as the data files, it will "see" the files with no need to specify the subfolder `data`. Renato, instead, has the notebook in the main folder and the data in a subfolder within this folder, so his path to the data is `data/filename`.
- How can I see all the folders like Renato when I try to run read_csv?
    - Start writing `read_csv('')` and, with the cursor inside the quotes within the parentheses, hit the Tab key. Does this work?
:::

::: warning
##### ✏️ Hands-on: Writing Data
As well as the `read_csv` function for reading data from a file, Pandas provides a `to_csv` function to write dataframes to files. Applying what you’ve learned about reading from files, write one of your dataframes to a file called `processed.csv`. You can use `help` to get information on how to use `to_csv`.

---

- How did he get the mouse-over help?? Seems very useful..
    - Shift+Tab to get the popup <-- NICE THANK YOU
    - In Jupyter notebooks you can also get help by typing a function name followed by a question mark (`?`) and executing the cell; it will display the documentation for that function. Does this help you? Did you want to know something different?

I'm done (add a `+`) +++++++++

---

**Shared code**
- LFB: `oceania.to_csv("path_to_file")`, for example `oceania.to_csv("data/processed.csv")`; then you can see the new file in the left panel.
- `data_oceania_country.T.to_csv('processed.csv')` to save it in the working directory.
:::

### Pandas DataFrames
:::success
##### ❓ Where are we: is the difference between `iloc` and `loc` clear? (add `+`)
Yes ++++++++++++++
- No (can you elaborate?)

---

- How do I extract a whole row/column?
    - covered next :+1:
- When he uses 0:10, why is it not `iloc` again? It's not by name (`loc`).
    - Sorry, can you tell me the full call? I missed it.
    - He is only using `.loc` now in all of the commands... but sometimes he is not referring to a name, just a number... I thought in that case we should use `iloc`.
    - I don't really see this; Renato is using column names now.
    - I can't scroll up his screen, but there was an example where he had 1:5.... AH, he talks about it now...
    - I also just tested `loc` with numbers: it does not throw an error, but it also does not give the expected rows. So please stick to the rule Renato just mentioned.
- Can I do `print(data.loc["gdpPercap_1952", "Albania"])`??
    - You specify the row first and then the column, so I guess you would need to swap the gdp column and the country. And then just try it. :-)
- Shouldn't (0, 0) be the country?
    - If you type `df.iloc[0, 0]` you get a single value: the one in the first row and first column. Our rows are the countries and the columns are the GDPs, so you get the 1952 GDP of Albania. Does this help?
    - Yes, thank you.
- This gives an error: `print(data.loc["Albania", :5])`
    - message: TypeError: cannot do slice indexing on Index with these indexers [5] of type int
    - You mixed label-based and position-based indexing. Please try something like this: `data.loc["Albania", :"gdpPercap_1957"]`. Does it work?
- Is it possible to also extract the country to which the max value is assigned?
    - Short answer is yes, but it's not straightforward considering what you learned so far. Wait a little more and it might become clearer.
    - `.idxmax()` would give the names, but no idea how to concatenate both

---

**Share code**

LFB: `data.loc[:, "gdpPercap_1952"]`; another example: `data.loc["Albania", :]`
:::

:::warning
##### ✏️ Hands-on: Extent of Slicing
Given the code:
```python
print(data_europe.iloc[0:2, 0:2])
print(data_europe.loc['Albania':'Belgium', 'gdpPercap_1952':'gdpPercap_1962'])
```
Explain or add a `+` to an existing explanation in the following questions.

1.
Do the two statements below produce the same output?
    - No +++++++++
    - I guess we have to use just "data", not data_europe.. ++
    - `iloc` treats the range as half-open and does not include the last position, while `loc` with the column names takes all of them, including both stated labels and the ones in between. ++
    - iloc seems to ignore the column and row headers and instead counts them as row or column 0.
    - the loc example refers to them by name.
2. Based on this, what rule governs what is included (or not) in numerical slices and named slices in Pandas?
    - setting the indexes
    - Label slices (`loc`) include both endpoints; position slices (`iloc`) follow the usual Python slicing rules and exclude the end.

---

- WOW, the slicing is different in R! This blows my mind....
    - Thank you for pointing this out
- The problem I have in Python so far is that when referring to names there is no auto-complete with Tab.. in R, once loaded, I can use Tab to complete expressions...
    - Can you use the Tab key for auto-complete? It should work.
    - Oh, I will try! So far it did not.
    - Try `data.` (press Tab key)
    - AH YES, THAT works, but not for "gdpPercap [tab]", which I hoped would complete to a year...
    - It could be that it does not work for strings. I would have to check (I also cannot complete here; not sure if one can change this).
- I have a problem with the mask; can you tell me if `data[mask]` is OK?
    - Can you tell us what error you are getting?
    - NameError Traceback (most recent call last) Cell In[62], line 1 ----> 1 print(subset[mask]) NameError: name 'mask' is not defined
    - Did you first run `mask = subset > 10000` and then `print(subset[mask])`? That is how you first initialize/define the `mask` variable.
    - yes
    - Try to re-run it and check that the line above executes correctly.
    - Rerun and the same error.
    - Can you paste your exact code? Did you maybe include a typo in `mask`?
    ```
    subset > 10000
    print(subset[mask])
    ```
    - You are missing the start of the code line: `mask = subset > 10000`
    - Ok, thanks!
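
The two points discussed in this box (what a slice includes, and why `mask` has to be defined first) can be sketched on a tiny table. The numbers below are made up for illustration and are not the real Gapminder values; only the country and column names follow the lesson.

```python
import pandas as pd

# Made-up illustrative numbers, NOT the real Gapminder values.
data = pd.DataFrame(
    {
        "gdpPercap_1952": [1000.0, 6000.0, 8000.0],
        "gdpPercap_1957": [2000.0, 9000.0, 10000.0],
        "gdpPercap_1962": [2500.0, 11000.0, 12000.0],
    },
    index=["Albania", "Austria", "Belgium"],
)

# Position slices (iloc) stop BEFORE the end position: 2 rows, 2 columns.
by_position = data.iloc[0:2, 0:2]
# Label slices (loc) INCLUDE the end label: 3 rows, 3 columns.
by_label = data.loc["Albania":"Belgium", "gdpPercap_1952":"gdpPercap_1962"]
print(by_position.shape, by_label.shape)

# The mask must be stored under a name before it can be used.
subset = data
mask = subset > 10000      # table of True/False values
print(subset[mask])        # entries where the mask is False show up as NaN
```

Running `subset > 10000` without the `mask =` part only displays the True/False table; nothing is stored, which is why the next line then fails with `NameError`.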
- Is it possible to give colors to True(Green) and False(Red)? - you could used e.g. the df.style settings to change the style of the output. But this exceed a bit the scope of this beginners workshop. **Shared code** LFB: If you wish to use numbers, and iloc you can add a column to the file,, using for example data.insert(0, 'Numero de Fila', range(1, len(data) + 1)). ::: :::warning ##### ✏️ Hands-on: Practice with Selection Assume Pandas has been imported and the Gapminder GDP data for Europe has been loaded. Write an expression to select each of the following: Explain or add a `+` to an existing explanation in the following questions. 1. GDP per capita for all countries in 1982. - `print(data.loc[:, "gdpPercap_1982"])` +++++ - - - - 2. GDP per capita for Denmark for all years. - `data.loc["Denmark", :]` +++ - `print(data.loc["Denmark", :])` ++ - - - 3. GDP per capita for all countries for years after 1985. - `print(data.loc[:, "gdpPercap_1985":])` ++ do we have the data for year 1985? ## Note, is it not true, that this answer is correct if the following columns are consecutive?. - No but I don't get this error. - `print(data.loc[:, "gdpPercap_1987":])`+++ - `print(data.loc[:,"gdpPercap_1987":"gdpPercap_2007"])` + - 4. GDP per capita for each country in 2007 as a multiple of GDP per capita for that country in 1952. - sorry i dont understand this question - explained by Renato. Is it clear? - `print(data.loc[:, "gdpPercap_2007"] / data.loc[:, "gdpPercap_1952"])`+++ - - - --- ::: **Let's come back at 20:05 (CET)** :::success ##### ❓ Are you back? - Yes +++++++++ - No ##### ❓ Any questions regarding what we did until now? - Q1: For the last question (2007 as a multiple of GDP per capita), there's any way to be sure about the operation that you're doing? (divide for that country) - A: Thanks! :+1: --- - Q2: - A: --- - Q3: - A: ##### ❓ Is the speed fine? 
- Yes: +++++ now:+, at beginning I needed to orient with documents - Too slow: - Too fast: +++ (after a long day) - Yes it is late in the day but hopefully you can still follow. ::: ### Plotting :::success ##### ❓ Jumping in: Were you able to import matplotlib? (add `+`) Yes +++++++ - No (please copy-paste any error message you received) - its not installed, looking up how to do that now... (sorry, i have several anaconda and all old on my laptop, theefore decided to use mambaforge instead to be clean.. ) - All good let us know if we can help you. The next sections should be better. If you wish to check your installations with us please email the organizers - good it seems, goign back to jpyther lab now - and YES, help with sorting out my anaconda mess would be great.. - why float on the x-axis? - float values on the x-axis of plot is that Matplotlib automatically determines the tick locations based on the data provided. Matplotlib might still choose to display them as floating-point numbers if the range of values is not large. - ¿can I zoom the plot? - you could specify the size of the plot. Is this what you want to know? If you wish to have interactive plots you can use the [plotly](https://plotly.com/python/) library. - although I am using a 64-bit system, I get dtype = "int32", is this a problem? - the int32 comes most likely from python? In what context did you get it? - How can I change from dtype='int32' to dtype='int64'? - you could cast the datatype by using `astype` but as Renato said it is not needed in our case. - could we get the code from his book? I am behind with typing and the screen goes up and down so I do not see all of the code... (I stopped with the gapminder data plot...) - You could follow the [tutorial](http://swcarpentry.github.io/python-novice-gapminder/instructor/09-plotting.html) for the sections you where missing. Does this help? - YES, did not realise it was in the swcarp file!! 
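
On the `int32` question in this box: a minimal sketch of the `astype` cast that was mentioned, in case someone wants to try it. The variable names and the three years are only an illustration; as said above, the cast is not needed for today's exercises.

```python
import pandas as pd

# A small Series explicitly created as int32, to mimic what some of you saw.
years = pd.Series([1952, 1957, 1962], dtype="int32")

# astype returns a NEW Series with the requested type; `years` is unchanged.
years64 = years.astype("int64")

print(years.dtype, "->", years64.dtype)
```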
- Now I have an error that `pd` is not defined, as in `data = pd.read.....` --> I re-opened the notebook, maybe some import was lost?
    - You need to import pandas again, like this: `import pandas as pd`
    - A GENERAL question on this: when I open a Jupyter notebook, does each cell have to be executed again? Is there a command to execute ALL, so I can just continue at the end?
    - Saving the notebook keeps the code and the displayed outputs, but variables and imports live in the running kernel; once the kernel has been stopped or restarted, the cells have to be run again. In JupyterLab you can do this in one go with the menu "Run" > "Run All Cells".
- I am getting an error instead of a plot: `AttributeError Traceback (most recent call last) Cell In[7], line 1 ----> 1 plt.plot(time, position) 2 plt.xlabel("Time in hour") 3 plt.ylabel("Pos in km") File ~/mambaforge/lib/python3.10/site-packages/matplotlib/_api/__init__.py:217, in caching_module_getattr.<locals>.__getattr__(name) 215 if name in props: 216 return props[name].__get__(instance) --> 217 raise AttributeError( 218 f"module {cls.__module__!r} has no attribute {name!r}") AttributeError: module 'matplotlib' has no attribute 'plot'`
    - Please import the module first by running `import matplotlib.pyplot as plt`
    - I had: "import matplotlib as plt"
    - Why does the .pyplot matter? The package name is just matplotlib....
    - Because here we are importing the specific submodule (`pyplot`) of matplotlib, which contains the plotting functions. I would recommend following the instructions. If you write `import matplotlib.pyplot` without the `as plt`, you have to call the functions with the full name, e.g. `matplotlib.pyplot.plot(...)`.
- Got int32, is this significant?
    - me too
    - It is not important for us today. The int32 has enough storage for our tasks.
- After replacing my data, the years' labels still have gdpPercap_, have I missed something? (sorted, thanks)
    - Did you try this line: `years = data.columns.str.replace('gdpPercap_', '')`? Please share the line you have.
- Maybe I missed it, but what exactly is the purpose of "T" in data.T.plot()?
- create a plot of the **transpose** of a DataFrame - thanks, got it - - - - - - --- **Shared code** (please don't add content outside the boxes, otherwise it becomes difficult for the helpers compiling the pad to know where to add) LFB: you can use for example ``` data.loc["New Zealand"].plot() data.loc["Australia"].plot() and run together ``` similar to `data.T.plot()` ::: :::success ##### ❓ Clarification: Why did we have to add the `.str.` attribute to our `.replace()` function? (add a `-` with your explanation or a `+` to existing explanations) - Because else it would be a dtype='object' and could not be used in the same manner as a string. + - - Because the columns names are string, then we need to use str to replace it. + #### ❓ Questions? - How could we find any other options that we have in the matplot? - you could use the python build in help function or check there [documentation](https://matplotlib.org/stable/). They even have [examples](https://matplotlib.org/stable/gallery/index.html), which could give you ideas. Additinaly you could use the tab completion to see the possible functions (as Renato just showed). - - I am stuck with tryong to replicate the plots from gapminder... so I will not attempt the below exercises but try to sort out my mess :) - That is totaly fine. Let us know if we can help. - Hi thanks for all the effort :) I am wondering do we get a certificate at the end of the course ? - Yes we give certificates at the end of the wrokshop. You would need to sent us you notebooks from today on. Please save the notebooks every day. We will tell you on the last day how you can get a certificat. - okie thats great :) Thanks. If i cant attend the last day, can I do the exercise on my own and send you the notebook after ? 
- Yes you can do this, - perfect thanks :) - please check the shared document history [here](https://biont.biobyte.de/s/TfzaOoBp0#) after the last day to get the instructions - Thanks :) - -will you share the recordings of the sessions at the end ? ::: :::warning ##### ✏️ Help!!!: More Correlations This short program creates a plot showing the correlation between GDP and life expectancy for 2007, normalizing marker size by population: ```python data_all = pd.read_csv('data/gapminder_all.csv', index_col='country') data_all.plot(kind='scatter', x='gdpPercap_2007', y='lifeExp_2007', s=data_all['pop_2007']/1e6) ``` ![](https://swcarpentry.github.io/python-novice-gapminder/fig/9_more_correlations_solution.svg) Using `help()`, `jupyter` completion, online help or other resources, explain what each argument to `plot` does. Explain or add a `+` to an existing explanation in the following sections: ###### kind= - Make plots of Series or DataFrame. Help: with shift + TAB Signature: data_all.plot(*args, **kwargs) Type: PlotAccessor String form: <pandas.plotting._core.PlotAccessor object at 0x7ff15d60a690> File: ~/anaconda3/lib/python3.11/site-packages/pandas/plotting/_core.py Docstring: Make plots of Series or DataFrame. Uses the backend specified by the option ``plotting.backend``. By default, matplotlib is used. ---------- data : Series or DataFrame The object for which the method is called. x : label or position, default None Only used if data is a DataFrame. y : label, position or list of label, positions, default None Allows plotting of one column versus another. Only used if data is a DataFrame. kind : str - - - - ###### x= and y= - x : label or position, default None - y : label, position or list of label, positions, default None - - - - ###### s= - the size of the circle+ - - - - - - - the png saving worked for me in jupyterlab. It works for me too ::: ### Feedback :::success ##### ❓ One thing that was good about today - I learned new things. 
It was very interesting.+++ - Awesome! I can follow it completely although I am a newbie + - Actually working with data helped me thoroughly understand what I was doing and what the commands were doing - Very helpful ++ - very interesting session. Learned a lot + - very helpful session, thanks + - I was able to follow along, except for the plotting, will have to try recapping that with the swcarp document myself+ - LFB: good - good explanations of all that we did! Thank you Renato for your patience + - the coding part todaz was easier to follow , a lot of information that covers all the basics - I was able to apply my knowledge from yestreday and could do a few things already on my own :) nice to see progresss. - - - - ##### ❓ One thing to improve - Increase your screen when you share. It is so difficult to read. - orientation at the beginnign with files, where to type, what is happening.. we had to get used to a new teacher (meaning: this requires a little time to get used to their way of explaining where we are...). - Maybe could have talked about other plotting tools/libraries. - - the need to always switch between screen is partially annoying - - I think should be good to show, the same thing could be made with another library. Today is good example when talking about plot. - - - a little bit more explanation about the code instead of going more further with difficuilty - - especially in the beginning was really difficult to follow, maybe some important steps would be better to be repeated so everyone can follow. If you stop somewhere its really difficut to continue after that - more breaks/more often ##### ❓ Any other comments? - - haha my time is 4am now - Whao, thank you for following us then. - I feel bad for complaining about being tired now! Impressive... No worries! great chance for me to learn. Thanks everyone! :) - I have some problems to see the difference between "()" and "[]" with my screen. Thx! 
- - I am in Europe and struggle with time past 8pm a little. + - sure, sure I understand the timing, just expalining I get tired :) - Suggestion: taking few 1 minute breaks at the beginnig for others to catchup - Thanks for your time and efforts - THANK YOU - RENATO! I guess you are out tomorrow.. - Thansk a lot and see you tomorrow - will you share the recordings of the sessions ? - Yes, we will. But only after editing them a bit, so this will not happen right away. Follow the project through its social and website to know when. ::: ## Day 3 - Thursday ### Schedule | Day | Tutorial | Instructor | | -------- | -------- | -------- | | 3 - Thursday | [Advanced programming](https://swcarpentry.github.io/python-novice-gapminder/11-lists.html) | Rabea | | Starting time | Duration | Content | | ------------- | -------- | --------------------- | | 18:00 | 10 min | Welcome + Summary | | 18:10 | 30 min | [Lists](https://swcarpentry.github.io/python-novice-gapminder/11-lists.html) | | 18:40 | 20 min |[For Loops](https://swcarpentry.github.io/python-novice-gapminder/12-for-loops.html) and [Looping Over Data Set](http://swcarpentry.github.io/python-novice-gapminder/14-looping-data-sets.html) | | 19:00 | 10 min | Break | | 19:10 | 30 min | [Conditionals](http://swcarpentry.github.io/python-novice-gapminder/13-conditionals.html) | | 19:40 | 20 min | [Writing Functions](http://swcarpentry.github.io/python-novice-gapminder/16-writing-functions.html) | | 20:00 | 10 min | Break | | 20:10 | 35 min | [Variable Scope](http://swcarpentry.github.io/python-novice-gapminder/17-scope.html) | | 20:45 | 15 min | Summary + Feedback | :::warning ##### ❓ Do you have any questions about yesterday's content? Q:Two questions, yes: 1. in the scatterplot the axis have automatic axes that are not the same. Therefor understanding correlations is not possible. How can I force the plot to have linear and same axis layout? 2. 
I tried making a line plot with two lines and tested this: `data.loc["Germany":"France"].plot()`, which did not work. In general, can I refer explicitly to two countries?

A:
1. I'm not sure if I understood your question correctly. Is your goal to have the same scaling for both axes? If this is the case you could try `plt.axis('equal')`.
    - Yes, they both should start at the same value and end at the same value.
    - Nope, that did not work... y still starts at 10K, and one goes in steps of 10K, the other in steps of 5K, etc. I can also use ChatGPT to solve it :)
    - I see. I will try to solve that later if ChatGPT cannot help you.
2. A slice like `"Germany":"France"` selects the rows from the first label to the second, and France comes before Germany in the table, so that slice is most likely empty. If you want two lines in one plot, type the following in a single cell:
```
data.loc['Germany'].plot()
data.loc['France'].plot()
```
Q:
A:

Q:
A:

Q:
A:
:::

:::success
##### ❓ Before we start: Is the screen clearly visible (add '+')
Yes + +++++++++
- Please zoom in
-
:::

:::warning
##### ✏️ Hands-on: Start up your Jupyterlab
Start your Jupyterlab instance and bring up the main page in your browser. Create a new Jupyter Notebook and name it `Day3`
##### ❓ Are you inside an empty Jupyter notebook? Add a '+' below
- Yes: +++++++++++
- Need time:
- Need help:
##### ❓ Do you need help? Please describe your issue
- When we type, we do not see the yellow and green colors at the same time; this is a bit hard for navigation. Colors are great when viewing only, but not when we have to edit.
- THANK YOU Rabea, yes :)
- We are not seeing what Rabea types..!!!!! ++++++
- What's the format of the list values? float64?
    - In a list you can have mixed data types. Python assigns the type of each value automatically, which is `float` in this case. A Python `float` is typically 64 bits.
- I got this error: name 'pressures' is not defined - no typo
    - It seems you have not defined the variable `pressures`. Did you execute the cell where you created it?
    - Yup, I did not. Problem solved :) TQ
- Is there a way to start index with 1 rather than 0?
    - Unlike in mathematics, computer scientists start counting at 0. This behaviour cannot be changed in Python.
    - Thanks
- Can we arrange the list in ascending order?
    - You can arrange the list in any way you like; for ascending order there is `sorted(my_list)` (returns a new list) or `my_list.sort()` (sorts in place). Technically the list holds references (pointers) to the elements inside of it.
:::

### 1. Lists
:::success
##### ✏️ Lists
Type along with Rabea to learn about lists
##### ❓ Do you need help? Please describe your issue or question
- How do I append multiple elements?
    - I just found out by accident: I ran it again, now I have 7 7 ..
    - Yes, but how to append multiple elements?
    - (I am a participant, not an instructor, no more news from me :) thx anyway :+1:
    - If you would like to concatenate two lists, you could do it like this: `two_lists = list_1 + [2, 3, 4]`. (Note that `list_1.append([2, 3, 4])` would add the whole list as one single element, and `append` itself returns `None`.)
    - You can use `list.extend()` to append multiple elements to a list. Rabea will demonstrate it later.

#### Shared code
If you want to create a list using a range: `lista = list(range(1,11))`
:::

:::warning
##### ✏️ Exercise 1 - Working With the End
What does the following program print?
```python
element = 'helium'
print(element[-1])
```
- 'm' +++++++
- output the last element (the first from the end) +++++

1. How does Python interpret a negative index?
    - s.a.
    - Starts counting from the end. +
    - negative indices are used to access elements from the end of a sequence +++++
2. If `values` is a list, what does `del values[-1]` do?
    - It removes the last item in the list, whether it is a number or a string. ++++++++
    - Removes the first element starting at the end of a list
    - error +
    - Can we discuss this? I tested it with a list (element = helium) and also got an error... +
    - What error did you get?
    - TypeError
    - Then your variable was not a list.
    - Isn't `values` a string?
    ```
    TypeError                                 Traceback (most recent call last)
    Cell In[50], line 1
    ----> 1 del element[-1]

    TypeError: 'str' object doesn't support item deletion
    ```
    - Yes, exactly! You can not use del with a string.
The reason is that strings are immutable: they cannot be changed. Lists are mutable, that's why we can delete in place. +
3. How can you display all elements but the last one without changing `values`? (Hint: you will need to combine slicing and negative indexing.)
    - `values[:-1]` +++++
:::

:::warning
##### ✏️ Exercise 2 - Slice Bounds
What does the following program print?
```python
element = 'lithium'
print(element[0:20])
print(element[-1:3])
```
- 0:20 prints the entire word and ignores that there are empty spots; -1:3 prints nothing.. +++++ why though?
    - I think this is just for convenience: slice bounds beyond the end are cut off at the end of the string instead of raising an error, which would otherwise break your code in some cases. `element[-1:3]` is empty because the start (-1, i.e. position 6) lies after the stop (3).
- lithium
:::

:::info
##### ✏️ KEYPOINTS Lists
- A list stores many values in a single structure.
- Use an item’s index to fetch it from a list.
- Lists’ values can be replaced by assigning to them.
- Appending items to a list lengthens it.
- Use del to remove items from a list entirely.
- The empty list contains no values.
- Lists may contain values of different types.
- Character strings can be indexed like lists.
- Indexing beyond the end of the collection is an error.
:::

### 2. For Loops
:::success
##### ✏️ For Loops
Type along with Rabea to learn about for loops
##### ❓ Do you need help? Please describe your issue or question
- We don't have to close the loop? 🤯
    - Answered by Rabea: with the indentation you open and close it implicitly.
- I just tried with "i" instead of number and it ONLY printed the LAST number...
    - You can use any variable name you like in the for loop for iterating over the list. `i` is the standard name for an index.
    - But every time I change from number it just prints 5 5 5! instead of 2 3 5...
    - Does your code look like this?
    ```
    for i in [2,3,5]:
        print(i)
    ```
    - FOUND THE MISTAKE... yes, I kept `number` in the print. Now wondering why it printed anything!
    - That was my guess :+1:
- in R I am used to for i in....
        - Yes, this is the standard name when counting up an integer in a loop in almost every programming language
- I accidentally executed this:
    ```
    for cat in [2,3,5]:
        print(number)
    ```
    - Why was there a different result compared to Rabea's example? I don't understand why the result is 3 x 5.
    - What did you define as `number`? If this is just a value, it will print this value 3 times.
    - OK, so probably the last result from the previous code was still stored in `number`
    - Exactly :+1:
- Why does "range(0,3)" not include the 3?
    - It's just a convention. In some cases it can make things easier.
    - Did not understand though :)
    - I think it has to do again with counting from zero, for example range(0,10) would contain 10 numbers (0-9).
    - You could also think of a start and stop value. In real life you would not include the stop value (at least this would be my intuition).
- I have a general question for Jupyter notebook: we have by now so many lists and vectors and strings etc. Is there a way to see my environment and see a list of everything that I created? Some of my exercises really got mixed up as I forgot names etc..
    - Great question: I will show you after the break
    - You could use `%whos`. It will print all variables that are assigned
    - Oh shit, did I miss the answer? I was back a minute too late.. sorry!

#### Shared code :)
```
nombre = "Leonardo"
a = []
for number in nombre:
    a.append(number)
    print(number)
print(a)
```
:smile:
:::

#### Let's come back at 19:10 (CET)
- Will there be another break today?

:::success
##### ❓ Are you back?
- Yes++++++++++++
- No

##### ❓ Any questions regarding what we did until now?
- Q1:
    - A:

---
- Q2:
    - A:

##### ❓ Is the speed fine?
- Yes: +++++++++++
- Too slow:
- Too fast:
:::

:::warning
##### ✏️ Exercise 3 - Classifying Errors
Is an indentation error a syntax error or a runtime error? Write your answer below and explain why.
- It is a syntax error.
  ++++++

:::

:::info
##### ✏️ KEYPOINTS For Loops
- A for loop executes commands once for each value in a collection.
- A for loop is made up of a collection, a loop variable, and a body.
- The first line of the for loop must end with a colon, and the body must be indented.
- Indentation is always meaningful in Python.
- Loop variables can be called anything (but it is strongly advised to give the loop variable a meaningful name).
- The body of a loop can contain many statements.
- Use range to iterate over a sequence of numbers.
:::

### 3. Conditionals

::: success
##### ✏️ If Statements / Conditionals
Type along with the instructor to learn about conditionals

##### ❓ Do you need help? Please describe your issue or question
- Is there any way to specify a condition for false cases? Like an "else" option.
    - Rabea will show it now. Of course you can write several conditionals with "else" or even "elif CONDITION"
- I can't figure out why I have a syntax error pointing at the '>' sign. Any suggestions?
    - Do you have quotes around the number?
    - Can you copy your code here?
        ```
        if mass is > 12.0:
            print(mass, 'is large')
        ```
    - The "is" is not necessary. So it's
        ```
        if mass > 12.0:
            print(mass, 'is large')
        ```
    - Python will evaluate the expression `mass > 12.0` to `True` or `False`
    - That worked. Thanks!
- Can I also use 3 (int) in the condition, or does it have to be a float because the list contains floats?
    - Of course you can use an integer as well. If you compare two different numeric types, for example float and int, Python compares them by value.
    - thx
- Can we do operations (2+2) in the "if"?
    - You can use almost every expression that evaluates to a boolean value. `(2+2)` evaluates to 4, which Python treats as `True` in a condition, but the result will not be stored.
    - Like, for example?
        ```
        if mass > (6+6):
            print(mass, 'is large')
        ```
    - Depending on the value it will evaluate to True or False.
      Yes, you can use a calculation in the expression. Python will evaluate the calculation before applying the greater-than operator.
- Difference between elif and else?
    - You can use elif if you want to have multiple conditions. The else block is executed if none of the preceding conditions is True.
    - Here is a code example:
        ```
        if x > 10:
            print("x is greater than 10")
        elif x > 7:
            print("x is greater than 7 but not greater than 10")
        elif x > 5:
            print("x is greater than 5 but not greater than 7")
        else:
            print("x is 5 or less")
        ```

#### Shared code
```
mass = 3
if mass > 3.0:
    print(mass, "is large")
elif mass == 3:
    print(mass, "is the same")
else:
    print(mass, "is small")
```
:::

:::warning
##### ❓ Exercise 5 TRACING EXECUTION: What does this program print?
```
pressure = 71.9
if pressure > 50.0:
    pressure = 25.0
elif pressure <= 50.0:
    pressure = 0.0
print(pressure)
```
- 25.0 ++++++++++++
    - and when I re-run the condition, 0.0; then it stays.
- It depends!!
    - The pressure value is adjusted based on the first condition being true +
- 0.0

:::

:::info
##### ✏️ KEYPOINTS Conditionals
- Use if statements to control whether or not a block of code is executed.
- Conditionals are often used inside loops.
- Use else to execute a block of code when an if condition is not true.
- Use elif to specify additional tests.
- Conditions are tested once, in order.
:::

### 4. Looping Over Data Sets

::: success
##### ✏️ Looping Over Data Sets
Type along with the instructor to learn about looping over Data Sets. Make sure to import pandas in your Notebook with `import pandas as pd`

##### ❓ Do you need help? Please describe your issue or question
- But now we will only save the last file, no?
    - Rabea will explain it now
- Would regular expressions also work to create a list of files?
    - For this case I would suggest globbing with the package "glob" (included in the standard library). It allows patterns that are similar to regular expressions, but simpler.
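
To make the globbing suggestion above concrete, here is a minimal sketch. The file names are hypothetical stand-ins modelled on the gapminder files from the lesson, and the empty files are created in a temporary folder only so that the pattern has something to match; with real data you would point the pattern at your own folder.

```python
import glob
import os
import tempfile

# Hypothetical file names, modelled on the gapminder files used in the lesson.
names = ["gapminder_gdp_africa.csv", "gapminder_gdp_asia.csv", "notes.txt"]

with tempfile.TemporaryDirectory() as folder:
    # Create empty placeholder files so that there is something to match.
    for name in names:
        open(os.path.join(folder, name), "w").close()

    # '*' stands for any run of characters, so this pattern selects every
    # file whose name starts with 'gapminder_' and ends with '.csv'.
    pattern = os.path.join(folder, "gapminder_*.csv")
    matches = sorted(os.path.basename(path) for path in glob.glob(pattern))

print(matches)
```

`glob.glob` returns the matches in no guaranteed order, which is why the sketch sorts them before printing.
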
- I got this error:
    ```
    FileNotFoundError                         Traceback (most recent call last)
    Cell In[26], line 2
          1 for filename in ["gapminder_gdp_africa.csv,""gapminder_gdp_asia.csv"]:
    ----> 2     data = pd.read_csv(filename, index_col="country")
          3     print(filename, data.min())
    [...]
    ```
    - Make sure the files are in the folder where your script is running. Also check the list in line 1: the comma sits inside the first pair of quotes, so Python joins both names into one single file name that does not exist.
    - It worked, thanks
- Sorry, I missed this: what does .min() do?
    - min() calculates the minimum value for each column!
- What if I want to calculate the minimum value for each row instead of each column?
    - Just transpose the table :+1:
    - You can set the axis to rows with `data.min(axis=1)` :+1:
- Why did Rabea

#### Info:
- Standard Python Libraries: https://docs.python.org/3/library/index.html

#### Let's come back at 20:10 (CET)

:::success
##### ❓ Are you back?
- Yes++++++++
- No

##### ❓ Is the speed fine?
- Yes:++(BIG PLUS) +++
- Too slow:
- Too fast:
:::

:::warning
##### ❓ Exercise 6 DETERMINING MATCHES: Which of these files is not matched by the expression glob.glob('data/\*as\*.csv')?
```
data/gapminder_gdp_africa.csv
data/gapminder_gdp_americas.csv
data/gapminder_gdp_asia.csv
```
- the first one
- gapminder_gdp_africa +++++++
- 1,2

#### Questions?:
- I am confused by the backslashes
    - These are escaping the star in Markdown syntax
    - YES! Aha! OK :)
    - I was looking in edit mode, not view mode...
- My files are in the same folder I am using Jupyter in. When I did this:
    ```
    for filename in glob.glob('as*.csv'):
        data = pd.read_csv(filename)
        print(filename, '\n', data.min())
    ```
    I wasn't getting any output, do you know why?
    - I think you may be missing the first * before the word 'as'.
    - :+1: Thanks
:::

:::info
##### ✏️ KEYPOINTS Looping over Data Sets
- Use a for loop to process files given a list of their names.
- Use glob.glob to find sets of files whose names match a pattern.
:::

### 5. Writing Functions

::: success
##### ✏️ Writing Functions
Type along with the instructor to learn about how to write functions.
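
Several questions in this section refer to a small `average` function that is called on a list of three numbers. As a point of reference, here is a minimal sketch of what such a function might look like; the exact code typed in the session is not recorded here, so the function bodies are assumptions, and `min_and_max` is a hypothetical helper added only to show how two values can be returned.

```python
def average(values):
    """Return the mean of a list of numbers, or None for an empty list."""
    if len(values) == 0:
        return None  # nothing to average; no explicit else branch is needed
    return sum(values) / len(values)


def min_and_max(values):
    """Return two results at once by packing them into a tuple."""
    return min(values), max(values)


# The square brackets build a list that is handed straight to the function.
mean = average([1, 3, 4])

# A returned tuple can be unpacked into separate variables.
lowest, highest = min_and_max([1, 3, 4])

print(mean, lowest, highest)
```

Defining the functions does not run them; the work only happens in the two calls at the bottom.
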

##### ❓ Do you need help? Please describe your issue or question
- Could we use the date from the system in the function somehow?
    - Yes, you can. There is a package in the standard library named [time](https://docs.python.org/3/library/time.html).
- Why is it not necessary to add an "else"?
    - You can leave it out. It behaves as if there were an else case containing only a `pass` statement (which does nothing). It makes the code look cleaner if you don't need the else case.
- Why use square brackets inside parentheses?
    - Square brackets represent a list. With parentheses, you call a function, in this case `average`. So this statement calls the `average` function on a list of three numbers.
    - Yes, the square brackets are Python syntax to create a list. The code can be interpreted as creating a list that is directly handed over to the function `average()`.
- I feel a bit confused with the definitions of the functions..
    - Perhaps it was too fast
- What if I want two values to be returned?
    - For returning variables there is the `return` statement (Rabea showed it in the demonstration). For returning two variables, there are several ways. You could use a tuple, `return a, b`, or alternatively use another data structure like a list. :+1:

:::

:::warning
##### ❓ Exercise 7 ENCAPSULATION
Fill in the blanks to create a function that takes a single filename as an argument, loads the data in the file named by the argument, and returns the minimum value in that data.
```
import pandas as pd

def min_in_data(____):
    data = ____
    return ____
```
Together please. +++
```
def min_in_data(file):
    data = pd.read_csv(file)
    return data.min()
```
I got this, but do not know how to call it:
```
min_africa = min_in_data("gapminder_gdp_africa.csv")
print(min_africa)
```
```
import pandas as pd

def min_in_data(data):
    data = pd.read_csv("/Users/PY_Course_2023/gapminder_*.csv", index_col="country")
    return filname, data.min()
```
and then I tried this, but it gave an error...
```
min_in_data(data)
```
- You should replace the path in the function with a variable containing the file name. `data` is not a good name for a filename variable. Better would be `file`.
    - Got it now, thanks

#### Question?
- Yes. Is there a way to specify that the argument "file" will always be a string? For example, to use the function like "min_in_data(gapminder_gdp_europe.csv)" (without the "" in "gapminder_gdp_europe.csv").
    - You mean that Python checks for the type?
    - If that is what you want: no, that is not possible.
    - What you could do:
        - Use [isinstance()](https://docs.python.org/3/library/functions.html?highlight=isinstance#isinstance) in the function to check if `file` is a string
        - Python supports [type hints](https://docs.python.org/3/library/typing.html), but they are only annotations and do not raise any errors if you break the annotation. (Thanks! :+1:)
:::

:::info
##### ✏️ KEYPOINTS Writing Functions
- Break programs down into functions to make them easier to understand.
- Define a function using def with a name, parameters, and a block of code.
- Defining a function does not run it.
- Arguments in a function call are matched to its defined parameters.
- Functions may return a result to their caller using return.
:::

### 6. Variable Scope

::: success
##### ✏️ Variable Scope
Type along with the instructor to learn about variable scope.

##### ❓ Do you need help? Please describe your issue or question
- In this case, is it better to define the pressure before calling the function or inside the function?
    - It depends. I would always suggest not using global variables, but you should know that they can be defined. When using global variables you should have a good reason for that. If not, better try to keep variables in their scope. (Thx! :+1:)

:::

:::info
##### ✏️ KEYPOINTS Variable Scope
- The scope of a variable is the part of a program that can ‘see’ that variable.
:::

### 7. Programming Style

::: success
##### ✏️ Programming Style
Type along with the instructor to learn about programming style.

##### ❓ Do you need help? Please describe your issue or question
- I am confused about the spaces
    - You mean this section? Can you be more specific?
    - Yes, you just showed it, thanks :)

:::

:::info
##### ✏️ KEYPOINTS Programming Style
- Follow standard Python style in your code, like [PEP 8](https://peps.python.org/pep-0008/)
:::

### Feedback

:::success
##### ❓ One thing that was good about today
- Great to have a joint start, and we were able to orient ourselves +
- The pace was amazing. I loved the session today and it was easy to follow. Also the number of breaks was great. +++
- Having two breaks.+
- It was always clear where we were in the HedgeDoc (again, we are getting used to it... )
- I look forward to tomorrow!
- Clear and nice explanations. Thank you a lot!

##### ❓ One thing to improve
- All good (I am also getting used to Python, which makes day 3 easier than day 1 and 2... )
- Add some examples from the bio/biomed topic (perhaps from your past real projects, but probably simplified for us)
- The def function/return part was a bit fast for me

##### ❓ Any other comments?
- Tomorrow is until 9pm CET or will we finish earlier?
    - It will be until 9pm CET
- Today was amazing. Thank you!
    - Thank you.
- THANK YOU RABEA! Say hi to your cat :heart_eyes_cat:
- Thank you and have a good sleep :wave: !
:::

## Day 4 - Friday

### Schedule

| Starting time | Duration | Content |
| ------------- | -------- | ----------------- |
| 18:00 | 10 min | Welcome + Summary |
| 18:10 | 20 min | [Introduction](https://workshop-building-websites-with-gitlab-biont-eu-2e49af9c9c94c62.gitlab.io/01-introduction/index.html) |
| 18:30 | 20 min | [Authoring With Markdown](https://workshop-building-websites-with-gitlab-biont-eu-2e49af9c9c94c62.gitlab.io/02-markdown/index.html) |
| 18:50 | 40 min | [Hosting Pages on GitLab](https://workshop-building-websites-with-gitlab-biont-eu-2e49af9c9c94c62.gitlab.io/03-gitlab-pages/index.html) |
| 19:30 | 10 min | Break |
| 19:40 | 30 min | [Work with Jupyter books locally](https://workshop-building-websites-with-gitlab-biont-eu-2e49af9c9c94c62.gitlab.io/05-jupyter-books/index.html) |
| 20:10 | 20 min | [Host Jupyter Books in GitLab](https://workshop-building-websites-with-gitlab-biont-eu-2e49af9c9c94c62.gitlab.io/06-jupyter-books-in-gitlab/index.html) |
| 20:30 | 20 min | Add your notebook as a chapter |
| 20:50 | 10 min | [Final remarks](https://docs.google.com/presentation/d/1QC6kOESKD84Bzq4UIxrdgbw7PL5WGq_ncUG9TXmM8jE) |

:::success
##### ❓ Before we start: Is the screen clearly visible (add '+')
- Yes+++++++++
- It's very small, I cannot read + AH, then it's fine... Please zoom in +

##### ❓ Before we start: Do you have Jupyter book and Git installed?
Run `git --version` and `jupyter-book --help` in the same prompt (either the terminal or the Anaconda prompt): both these commands should give some output.
- Shouldn't it be `jupyter-notebook --help`?
    - jupyter-book is something different that we will be using today. It will be explained during the course.
- How can we check again that it is installed?
    - It is written above, just type `jupyter-book --help` either in the Anaconda prompt or your terminal (depending on your system)
- What should I get if I have it installed?
- Done +++++++*
- I need help +
    - With the installations?
        - Is the Jupyter different from yesterday?
        - Yes
    - Do you have conda?
    - Try `conda install -c conda-forge jupyter-book`. If this takes too long, you could also try `pip install jupyter-book`. Both can be executed in your terminal or conda prompt
        - It's installing... I know my pip does NOT work.. (longer story) :+1:
        - How can I test if installing was successful?
        - Try `jupyter-book --help`
        - It gives me `Usage: jupyter-book ...`
        - This should be fine. It should give you a help page with options on how to run jupyter-book
- I don't have jupyter book! +++
    - Do you have conda? Yes
    - Then try `conda install -c conda-forge jupyter-book`. If this takes too long, you could also try `pip install jupyter-book`
    - It's installing (Solving environment).
    - Depending on your system (Windows and macOS) it can take a bit of time to install it using conda. jupyter-book will only be needed later in the workshop, so even if it takes 30+ minutes, you will be fine.
- Sorry, I am confused, should we open Jupyter Lab? And GitLab?
    - Not yet, currently we test whether you have jupyter-book available. Just type `jupyter-book --help` in your terminal or Anaconda prompt.
- `(base) C:\>jupyter-book --help` `'jupyter-book' is not recognized as an internal or external command, operable program or batch file.` I get this error
    - Then it is currently not installed on your system. Please run `conda install -c conda-forge jupyter-book`
    - Thanks. I ran this but I didn't get any message
    - I assume you are using Windows? Yes
    - OK, maybe try to close your command prompt and open it again with `run as administrator` by right-clicking when opening the prompt. That way we ensure that we do not run into permission issues. Then try the command again.
- I did "pip install jupyter-book" and it's installed, but then I did "jupyter-book --help" and it's not working :(
    - What operating system do you use?
    - Windows
    - OK, that might mean that the paths are not correctly set. What you can try is to close your prompt, then, when opening a new one, right-click and select `run as administrator`. Then try again and see if it works.
    - Now it works, thanks! :D
    - Excellent :)
- GitLab is WEBSITE based, right?
    - GitLab is website based, yes. Most operations can also be run using the command prompt, but for now please stick to the webpage.
- In gitlab.com, shall I open the 'zero to Hero project'?
    - Not necessary at this point
- Can we have a break till 8:15? I think this would be better 3
:::success
:::

### Introduction

::: warning
##### ✏️ Hands-on: Set you up
- Login to GitLab +++++++++++

##### ❓ Do you need help? Please describe your issue
- I need a third screen.... so please bear in mind that some of us have to switch back and forth...
    - HedgeDoc!
- Thank you Lisanna for the slow start allowing us to catch up +
- What is the difference between GitLab and GitHub? Thanks! :+1:
    - Both are platforms built on top of git (open source and freely available software). Each of the two (GitLab and GitHub) is run by a different company. They are a little different but essentially meet the same needs with similar functionality.
    - For an in-depth list of differences, see [this comparison page](https://about.gitlab.com/competition/github/) provided by the same people that created GitLab.
- Looks visible
- Can we create the website also using Python? :+1:
    - There are many systems that allow you to create webpages, but they are all using HTML in the background. You cannot use Python to replace HTML to build a page. But there are Python libraries that allow you to create webpages; these libraries create HTML / CSS in the background for you. In general you have to keep in mind that a webpage is always using HTML (and/or CSS and more).
    - Thank you for the extensive reply!
      It sounds interesting to explore in the future
    - You are welcome :) It is a vast topic and the number of available frameworks for HTML is overwhelming. It is an interesting topic to check out :)

:::

:::success
##### ❓ Watch the instructor creating an HTML file and opening it in the browser. Don't follow along, it's not needed. Do you have any question about HTML files?
- I'm not sure if I understood correctly the difference between static and dynamic.
    - For static pages you have your HTML already created in advance. There are no interactive parts that will create new HTML based on your user interaction.
    - For example, if you have a documentation, it is static. Everything is set up in advance and nothing is changing. (No calculations in the background that create new pages)
    - If you have a page like google.com and you search for something, there is a program running on the Google web server that processes your search, then creates new HTML/CSS code with the results and links you to this page.
    - Usually, if you have an account of any kind, you will also need a database in the background. This is also an indication that the webpage is dynamic (like gitlab.com or any page like Facebook and co, where you have a user account).
    - I hope this makes it a bit clearer.
    - Be aware that static pages can use JavaScript, so you can adapt the existing HTML code (showing more or less), but you will not do complex calculations or add more pages using this.

![](https://biont.biobyte.de/uploads/f27eea4f-c473-40c7-b571-dffa1c8b08b0.png)

:::

:::success
##### ❓ Given the following types of websites, reason if a static site generator is an appropriate solution to implement them. Add an x next to those that you think are a good example of static website case.
- (1) A personal website with About and Projects sections +-++++xX
- (2) A forum or discussion platform +
- (3) A community blog or news website
    - What about when users can add comments?
- (4) A search engine (such as google.com) ?
- (5) A wiki (such as wikipedia.com) +++++×X
- (6) An online book + ++×x+X

#### Question
- Is the IMF website, with multiple charts for users to interact with, static?
    - As soon as you have users on a webpage, you need some kind of database to handle logins and create personalized pages in the background. That means it is dynamic. +
:::

::: warning
##### ✏️ Hands-on: Your first project
- Create a new project
- Call it `Data analysis`, the project slug will automatically be set to `data-analysis`
- Check the `Initialize repository with a README` option

##### ❓ Do you need help? Please describe your issue
- It says "create a project"...
    - Yes, you can click here
- I don't have a dropdown menu under Project URL
    - If you are not assigned to any group in GitLab, it is likely that you do not have a dropdown menu. If you see `https://gitlab.com/username/projectname` it is fine (`username` will show your personal username)

:::

### Authoring With Markdown

::: warning
##### ✏️ Hands-on: Write your README
- Write your `README.md` file in Markdown (the same language as this collaborative notepad)
- Save a version of this project with your first commit

##### ❓ Do you need help? Please describe your issue
- I am getting the following notification: "You can't push or pull repositories using SSH until you add an SSH key to your profile." +
    - This is fine, you can ignore this notification for now. SSH is a convenient protocol to interact with git and remote computers, but we won't need to use it today.
- How can we write in a new line? Ahh, it is with a blank line between the two text lines.
    - In what file do you want a new line? Usually you should be able to use Return to jump to the next line if you are in Edit mode.
    - The instructor is showing you the web IDE right now, which is far more intuitive as it resembles a text editor a bit more.
    - Ah sorry, I get now what you mean. Yes, in Markdown you need to add a blank line if you want the rendered view to start a new paragraph
    - You can also add two spaces at the end of the previous line to get a hard line break
- I get an error message: "The editor could not be opened due to an unexpected error: Unable to resolve resource walkThrough://vscode_getting_started_page" (I guess it's fine, it is just not loading the start page)
    - OK, check whether you can follow the instructions from Lisanna. If not, please write again and we will try to help.
- Sorry, I missed how to get the yml document..? (got a distraction running around here)
    - (:heart: to the distraction) You can click on the add file button and you will be asked to enter a name. You will then call it `.gitlab-ci.yml`
    - I got it, but it does not show the icon and does not have the highlighting
    - Is the extension of the file `.yml`? And did you already save the file?
    - It is so small: it was NOT `.` but `;yml`... thank you
    - You can use `Ctrl`+`+` to zoom in and make the font larger
    - I got a green colour "stage"
    - Success!!! But we will see more about this soon.
- How did we get the index to become a sub-entry of public?
    - You can right-click on public and should be able to add a file there.
    - Hm, no... these are the right-click options..
    - Ahh, you created a file and not a folder. Your public is currently a file. Delete it and create a folder (directory) called public. Then it should work.
    - SOLVED. It looks the SAME on the screen..
    - Yes, it can be confusing at first. In Visual Studio Code (which this IDE is) you see that it is a directory when it has an arrow in front of it ">".
- Mine failed +
    - Can you be more specific? You could not push?
    - Maybe, yeah. How to move from the black page to the main one?
    - Unable to create pipeline `jobs: only has to be either an array of conditions or a hash`
    - Can you copy-paste here the content of your `.gitlab-ci.yml` file?
        ```
        pages:
          stage: deploy
          script:
            - echo 'Nothing to do...'
          artifacts:
            paths:
              - public
          only:
            - main
        ```
        That's the index:
        ```
        <html>
          <head>
            <title>Home</title>
          </head>
          <body>
            <h1>Hello World!</h1>
          </body>
        </html>
        ```
    - Does the first line start with `- pages:`? If so, remove the `-` so it reads only `pages:` and not `- pages:`
    - There needs to be a space between `-` and `main`, so it reads `- main`, not `-main`
    - Also note that, much like in Python, in `yml` files the indentation is important and meaningful.
    - NOW IT PASSED. Thanks
    - Great!!
- What to do after it passed?
- I do not see the "pipeline" in the sidebar...
    - Could you just show again how to click the last step?
    - AH, I did not know it was under "Build"! It says passed :)
:::

### Hosting Pages on GitLab

::: warning
#### `.gitlab-ci.yml`
* https://docs.gitlab.com/ee/ci/lint.html

```yaml
pages:
  stage: deploy
  script:
    - echo 'Nothing to do...'
  artifacts:
    paths:
      - public
  only:
    - main
```

#### index.html
```html
<html>
  <head>
    <title>Home</title>
  </head>
  <body>
    <h1>Hello World!</h1>
  </body>
</html>
```

#### Have you been successful?
- ++ it works ++++++
- For me it shows Failed!!! +++

#### Questions
- ***Can you show again??***
    - What part specifically?
    - All ready, thanks
- I am lost starting from the deploy part. I don't have other files besides the README after going back to the Data analysis page
- Mine says failed... any idea why? It says yaml invalid +
    - I got the error too, and it was that the 'only' was at the same level as 'paths', but it should be at the same level as 'artifacts'!
    - I copied the text from here and now it works, but it's because the "echo" and "public" parts were in a different position (one TAB less). But it's not the same structure as the instructor's!!
    - Great that it works. Yes, indentation is important. Try to really have the same structure as shown by the instructor. The number of spaces should not matter as long as you keep the same style over the whole file. So you might have slightly more indentation than the instructor.
- Break until 7:15 Berlin time
    - Yes

#### Material
- https://en.wikipedia.org/wiki/YAML
- YAML validator: https://www.yamllint.com/ (of course GitLab has one built in under the menu "Build" > "Pipeline Editor")
    - :+1:

#### More questions
- The pipeline failed due to an error in the CI/CD configuration file.
    - Can you copy your `.gitlab-ci.yml`?
    - Success, I just copy-pasted from above!
    - :+1:
- Where did you click to go to the website?
    - Deploy > Pages
:::

#### Let's come back at 19:15 (CET)

:::success
##### ❓ Are you back?
- Yes +++++++++++
- No

##### ❓ Any questions regarding what we did until now?
- Q1: How to go back to the Data analysis page from the web IDE?
    - A: Usually the IDE opens in a new tab, so the GitLab site should still be open in another tab.

---
- Q2: How can I open my new webpage? THX! :+1:
    - A: In the side panel you have `Deploy > Pages`; you will find the link to your page there.

---
- Q3: If I want to change the text (the Hello World!), do I need to change the index.html inside public, then do the Pipelines thing and the deploy pages?
    - A: Yes, the instructor is currently showing it. You do not need to redo all the steps. You just change the index.html and push it again. The rest will now update automatically, as we have set up the `.gitlab-ci.yml` accordingly. The push triggers the pipeline, which builds the page again.

---
- Q4: Oops, I got a 404 when I do Deploy > Pages. I am sorry I did not see this before.
    - A: This can happen if the page is still building or currently updating. Wait a minute and try again. If you still see this issue, please let us know.
    - In Build > Pipelines it says passed, and this is also true for the page we built 30 min ago. Will go and check in a minute or two...
    - I had a similar issue earlier: even though it had passed, it needed some time to show the page. It might be due to file latency on the GitLab server.
    - You can also try to close the tab and open the link again; it might be a caching issue.
    - Still the same.
    - Strange. What you can try is to go to `Build > Pipelines`, then click on the passed symbol and then on the circular arrows to rerun the pipeline. This will trigger a new run and it will hopefully work.
    - Tried, and still no... :-/
    - OK, very strange, but have you seen the page at all today? So before the last changes?
    - No, I thought seeing Build > Pipelines > passed was enough :)
    - Mhh OK, this is hard to debug. It seems that you have the correct `.gitlab-ci.yml` file, otherwise it would not trigger. Can you check that the public folder is set in this file? And can you check that it is called exactly the same in your project folder? The `index.html` should be inside the `public` folder.
    - (I am trying to follow along, so excuse my absence... ) Yes, index is inside public. BUT: not sure if it is called index.html, it seems to just say index
    - No worries :) It is good that you try to keep up :)

![](https://biont.biobyte.de/uploads/e77435e5-2ddd-4b01-a88a-53e5e72b466b.png)
![](https://biont.biobyte.de/uploads/bf38fe6c-c632-4156-a20f-2190444d6f75.png)

- Yes, this is the issue. You should rename it to `index.html`, because the extension is important for the builder to know that this is already HTML and that it can be served as is. I am sure that it will work afterwards :)
    - YESSS success!! Thank you! Jumping back to Jupyter Book now...
    - Nice, great that it works :)

---
- Q5:
    - A:

##### ❓ Is the speed fine?
- Yes:+++++++++
    - Very good speed!
- Too slow: +
- Too fast:
:::

:::success
##### ❓ Did you notice how a commit is something more than a simple save? We will not discuss Git in detail today, but it is a wonderful tool to manage the versions of a coding project. Next thing to learn on your list! Please add comments below.
- +
- Why GitLab and not GitHub?
    - Both are valid options. GitLab is an open-core company, while GitHub is owned by Microsoft. It is a personal choice.
    - Are there also technical differences which make GitLab better for free usage?
    - There are some differences between both platforms. GitHub is older and might have some more features available. It also has a paid version with additional features. But GitLab tries to offer all the options you have in GitHub. In my opinion GitLab currently has the more intuitive integration of pages, but the differences are minimal.

:::

::: warning
##### ✏️ Hands-on: Check your website: it's alive!
- Navigate to the pages to see your live website +

##### ❓ Do you need help? Please describe your issue
-

:::

### Work with Jupyter books locally

::: warning
##### ✏️ Hands-on: Creating a Jupyterbook locally
- Create a Jupyter Book template with `jupyter-book create ...`
- Build it locally with `jupyter-book build ...`

##### ❓ Do you need help? Please describe your issue
- pwd does not work with the Anaconda prompt :(
    - Perhaps it is shown there already on the line?
    - Yes, but how can I check the absolute path? Yes, "cd" works! :+1:
    - command not found (error)
    - If you are using the Anaconda prompt, use cd? I am using Git Bash
- Is it okay if I work in a different directory than Desktop?
    - Yes, you should just know where you created it
- Does jupyter-book also support qmd files?
    - As far as I know there is currently no support for qmd files. You may be able to use them in [Jupyter lab](https://quarto.org/docs/tools/jupyter-lab.html). But no expert here.
- What happens if I have the same file name with two different file types? E.g. markdown.md and markdown.ipynb? (because in the `_toc.yml` the file names are without the suffix)
    - Apparently Jupyter Book can automatically detect them. So if you add markdown twice in the `_toc.yml` file, it will prioritize the `.md` file, and for the second entry it will use the notebook. But if you can avoid using the same name, I would highly suggest you do.
- Can I use LaTeX with jupyter-book? Not only Markdown
    - You can use LaTeX inside of Markdown, but as far as I know you cannot use LaTeX only.
In markdown, latex can be added using the `$` symbols: $E = mc^2$.
- Why do I have an "extra file" called "- file: markdown-notebooks"?
    - It might be a versioning thing. I also have this file. In our version this might be the new default.
- ls does not work in windows
    - ls is not recognized as an internal or external command,
    - Try to use "dir"
    - :+1:
-
- which commands did we execute? I was busy trying to fix my website deploy...
    - `jupyter-book build jupyter-book-demo`

##

- yes, I can see it ++++++

I see that it includes a search function. Is it configurable? After you run the jupyter-book build example and navigate to the static site, a search logo appears where you can enter text. I made my own page with haskell content, then tested the search function, and it returned the page. It seems to be an included feature for static search. But I do not know how we can configure it, or the depth of this static search function.
- can you give more background on where you see a search function?
- In general you can customize a lot about this using javascript. But sometimes it is challenging to change these things in pre-created pages like GitLab Pages. I would have to look this up myself. But you can adapt both the style and the javascript of jupyter-book. I checked it a bit. It seems to be possible to change this, but it seems to be quite hard.
- I don't have a public button anymore ++
    - Did you open a 'new project' and then select 'blank project'? yes, I did +
    - do you see a heading: 'Visibility Level'? This should be in the middle. And here you usually have the option public or private.
    - yes but this time there is only the Private option.
    - Then maybe it is something with your settings. I would also need to check
    - ![](https://biont.biobyte.de/uploads/10d498bc-31e3-46e1-886d-6ef6fb1adbb1.png)
    - Did you select your personal ID for creating the project? For the figure above e.g.
it is teresa-m

-
-
-
-

:::

### Host Jupyter Books in GitLab

::: warning
##### ✏️ Hands-on: Hosting your Jupyterbook remotely
- Create a new project in GitLab
- Push your local version of the book to the remote project to make it public

##### ❓ Do you need help? Please describe your issue
- why without readme this time? thanks! it helped a bit
    - This avoids creating an initial commit already. That way we can more easily add our jupyter book to the webpage.
- sorry, was getting a question answered, what was done after creating the new project
    - Is this resolved? Or are you still having trouble following?
    - still having trouble
    - What was the last step you did?
    - creating the new project on gitlab
    - So you created the jupyter-book locally and then created a new project?
    - yes
    - Ok, then you need to follow the instructions on the page. I will try to add them here again.

```
cd <existing_folder>
git init --initial-branch=main          # start a new repository with a "main" branch
git remote add origin https://gitlab.com/<your name>/my-book-website.git   # link it to your GitLab project
git add .                               # stage all files of the book
git commit -m "Initial commit"          # record them in a first commit
git push --set-upstream origin main     # upload and remember the remote branch
```

- When I do "git commit -m "Init Jupyter book"" i obtain this: "fatal: unable to auto-detect email address" :(
    - It should show you some instructions on how to set your email.
    - Yes but how can I solve this? (YES, WITH THESE COMMANDS IT WORKS!!!)
    - `git config --global user.email "your_email@example.com"`
    - `git config --global user.name "Your Name"`
    - These commands should help you set your name and email. You only have to do this once.
- I got a different message now: Author identity unknown *** Please tell me who you are. +1+
    - Did it specify which commands you need to execute to define your identity?
    - yes, to provide email and username
    - `git config --global user.email "your_email@example.com"`
    - `git config --global user.name "Your Name"`
- also says please tell me who you are and fatal: unable to auto-detect email address
    - `git config --global user.email "your_email@example.com"`
    - `git config --global user.name "Your Name"`
    - These commands should help you set your name and email. You only have to do this once.
- maybe we can make it a 5 min break :)
- also, i use breaks to re-check my documents and make sure I am all good, so not sipping cocktails while you work haha

#### Tips
- If somebody made the same mistake as me and mistyped the URL for `remote add origin`: check with `git remote -v` and then change the URL with `git remote set-url origin <correct url>`.
- Another way: you can directly edit the `.git/config` file and change the URL there (my example: https://gitlab.com/lbauchwitz/haskell-book.git). Then, when you run `git push`, it asks you for your username and password. To find the URL, go to the clone button; it is shown in the box "Clone with HTTPS". Copy this value into the `url` key of the `.git/config` file.

```
[remote "origin"]
    url = https://gitlab.com/lbauchwitz/haskell-book.git
```

- It worked for me. ++++
- I am not sure, but seems yes +
- I want to add +, but not sure actually: I see all the documents in my gitlab projects, but I cannot see a website and build > pipelines does not say passed. Maybe I have to actually select a website template first? it gives me lots of options below...
    - I think this will still come, we will have to add a new `.gitlab-ci.yml` specific to the jupyter-book setup. Ok this is still coming now. to which folder?
- two factor sucks so much.....

:::

:::success
##### ❓ Are you back?
- Yes +++++++++
- No++

##### ❓ Any questions regarding what we did until now?
- Q1: What is the instructor doing now?
    - A: She will now explain what the GitLab repository should look like
- Would we also be able to see and recover previous versions?
    - Yes, GitLab is built on Git, a version control system, so you can check all previous commits. But you will only be able to see the old files, not the old webpages. You could, however, revert to an older version and rebuild from the old files.
    - There are ways to keep the old versions around, but that can be confusing and wasteful. We saw how the public folder is used; you could have different subfolders.

---
- Q2: For me all executed well but I don't see anything new in the gitlab folder?
    - A: Did you refresh the webpage? After uploading the content from your local folder you will have to refresh the page to see any changes.
    - Yes! still no new files.
    - Ok, so you followed all the steps to upload to gitlab and you were in the folder containing all your files (not the previous folder)?
    - Yes I am in the folder jupyter-book-demo.
    - Can you make sure that `git remote add origin https://gitlab.com/<your name>/my-book-website.git`
        - error: remote origin already exists. It gives this error.
        - use `set-url` instead of `add` -> `git remote set-url origin https://gitlab.com/<your name>/my-book-website.git`
        - Yes No error for this.
    - points to the correct repository. If you made a typo here, it will not work. You have to adapt both the `<your name>` and the `my-book-website` part to your specific gitlab directory.
    - And did you get any error at any point? If so, can you copy it here?
    - Now it works, thank you.
    - Great!
    - Can you show me your .git/config file? See preferences in your repository
- where should the file be, in build folder??
    - In the main folder, not in the build folder
- Q: problem: "failed". I re-run now - still failed... I have these files:
    - What is the error?
    ![](https://biont.biobyte.de/uploads/d300f39c-565a-4c6f-9e15-4646c27e2d52.png)
    - so my gitlab-ci.yml is there. The next thing I tried was then to deploy and that does not work.
it keeps saying "failed".
    - Try clicking on failed. What error shows up?
    - A: Did you get an error message?
    - in Build>Pipelines it says FAILED as status. But no error message was sent. The content of the yml file is copied from the previous project..
    - The .yml file needs to be adapted to the current project, I will paste it underneath (see "Updated `.gitlab-ci.yml` file" below)
    - GOSH, i wanted to avoid an error from copying from website and thought I was smart.
    - no worries :D These things happen to the best of us :)
    - success...phew
- Q: i also got an error: Initial commit - nothing to commit (create/copy files and use "git add" to track)
    - can you check with `git status` if you would need to `add` a file first
    - You might have missed the `git add .`
    - I ran it
    - What does `git status` print for you?

----
- Q:
    - A:

----
- Q:
    - A:

:::

:::warning
##### ❓ Is the speed fine?
- Yes: ++++++
    - Excellent!
- Too slow:
- Too fast: ++++

:::

:::success
##### Updated `.gitlab-ci.yml` file to build the jupyter-book remotely

```yaml
pages:
  stage: deploy
  image: python:slim                 # small Python image used to run the build
  script:
    - pip install -U jupyter-book
    - jupyter-book clean .
    - jupyter-book build .
    - mv _build/html public          # GitLab Pages serves the "public" folder
  artifacts:
    paths:
      - public
```

- it seems to work ++
- All works fine
- it works but my page looks different than yours (light background, toc contains: "Welcome to your Jupyter Book", "Markdown Files", "Content with notebooks", "Notebooks with MyST Markdown")
    - This is fine, Lisanna is using a dark/night mode on their computer

:::

:::success
##### ❓ Your time to experiment with a template
This template is deliberately minimal to give you the opportunity to test your documentation reading skills. Check the topic guides at [jupyterbook.org](https://jupyterbook.org/intro.html) and find a way to:

1. Add another page called "About" and link it from the table of contents.
3. Play with the file format of this new page: add the same type of content in Markdown, reStructuredText and Notebook formats.
4. Add one figure and a figure caption.
5. Insert a code cell. If you are familiar with any programming language, add a simple plot and show its output in your page.
6. For more advanced code features, check [how to make the code executable](https://jupyterbook.org/interactive/thebe.html)
7. Check the [gallery of Jupyter books](https://executablebooks.org/en/latest/gallery.html) for inspiration!

Add your comments below:
- can't find the steps+++
    - which steps?
    - the first one :-)
    - Try adding a new file to the repository with the name mentioned in point 1, possibly with one of the extensions we used today
    - does [this page help?](https://jupyterbook.org/en/stable/file-types/index.html#allowed-content-types)
-
-
- How do I go about it if I want to add a file from my computer (not the Web IDE)?+
    - Have we seen this today?
    1. Create a new file, e.g. copy and rename an existing one
    2. Add your file (also changes): `git add <your file>` (`git add .` adds everything in your current folder. Be careful!!)
    3. Commit your changes: `git commit -m "message"`
    4. Push your changes: `git push --set-upstream origin main` (`git push` after the first push)
- like that?:

```
git add .
git commit -m "****"
git push --set-upstream origin main
```

- `--set-upstream` is also something you only need the first time you push or pull; after that `git push` and `git pull` are sufficient.
- You don't have to commit before you push?
    - You always have to commit your changes first; `git push` only uploads commits.
-
- I have the following error: (me too)

```
! [rejected] main -> main (fetch first)
error: failed to push some refs to 'https://gitlab.com/XXX/my-jupyter-book.git'
hint: Updates were rejected because the remote contains work that you do not
hint: have locally. This is usually caused by another repository pushing to
hint: the same ref. If you want to integrate the remote changes, use
hint: 'git pull' before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
```

- Lisanna is explaining why this happens. The cause is that you have changes in GitLab that you don't have locally.
    - And how can we solve this?
    - Usually git suggests some things to do. Error messages try to be helpful.
    - can you try `git pull --set-upstream origin main`
    - git pull helped :+1:
-

:::success
+ https://haskell-book-lbauchwitz-70b64d763ff36617c63a625e0bd4972289cf81f.gitlab.io/about.html

:::

### Feedback

:::success
##### ❓ One thing that was good about today
- I have not used jupyter-book before. I think it is great. The teacher really talks very clearly.
- I very much liked in the whole course the tireless help on the hedgedoc .. every one of my questions was answered and all my issues solved!! Thank you to the entire team. ++
    - :heart::heart::heart:
- Today, i loved learning yet more tools and workflows.... on day 1 I almost cried being confused about them all, today I happily installed yet another jupyter-something-something! +
- to see where to go next with learning coding in python
- Very interesting session. Thank you a lot!
- Thank you! I learned a lot again :snake:
- Really thanks, great team. +

##### ❓ One thing to improve
- I had hoped for more python applied to bioinformatics, but of course this is an introductory course. I am very grateful
    - We taught some basics (gave you the tools), the art is now your part :blush:
    - See also the project [Rosalind](https://rosalind.info/problems/locations/) if you want some bioinformatics challenges.
    - :heart:
    - Great! Thanks
- sometimes Lisanna's internet connection had some trouble (pauses in speech)
-
- I also agree on our hunger for bioinformatics-related tasks

##### ❓ Any other comments?
- I believe that being able to say something in zoom, and NOT being anonymous, would have been great. On Day 1 and Day 2 we had trouble with HedgeDoc, so when you don't have the right link you can never even write a request for help there.
- Looking forward to the write-up on GitLab...

### THANK YOU!!!
have a nice weekend all of you trainers! Bye :wave:

:::

::: warning
##### ✏️ How to get a certificate
If you would like to get a certificate for this workshop, please follow these three steps:

1. Complete the post-workshop survey: https://survey.bio-it.embl.de/284869?lang=en
2. Find your notebooks of day 2 and day 3. If you named them something other than 'day2' and 'day3', please rename the files to 'day2.ipynb' and 'day3.ipynb'.
3. Send an e-mail with your unique personal identifier and the two notebooks attached to: muellert@informatik.uni-freiburg.de

Deadline: Monday 27th of November at 15:00 (CET). Please kindly note that this deadline is final and later submissions will **not** be taken into account!

- I won't need any certificates.

##### ❓ Questions?
- Where do i find my unique identifier again?
    - You had to create it in the pre-workshop survey and will have to create it again in the post-workshop survey. You create it as described below (*)
-
-
- Are the recorded sessions available?
    -

:::

(*) Unique identifier: Number of siblings (as numeric) + First two letters of the city you were born in (lowercase) + First three letters of your current street (lowercase). This will help us match your pre- and post-workshop answers in an anonymized way.

- Thank you for this workshop.+
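
Since this was a Python workshop, the unique identifier rule (*) can be sketched in a few lines. This is only an illustration: the function name `make_identifier` and the example values are made up, and it assumes the three parts are simply joined without any separator.

```python
def make_identifier(siblings: int, birth_city: str, street: str) -> str:
    """Join the three parts of the rule (*) into one string.

    Assumption: the parts are concatenated without separators.
    """
    city_part = birth_city.strip().lower()[:2]   # first two letters, lowercase
    street_part = street.strip().lower()[:3]     # first three letters, lowercase
    return f"{siblings}{city_part}{street_part}"


# Hypothetical example: 2 siblings, born in Freiburg, living on Hauptstrasse
print(make_identifier(2, "Freiburg", "Hauptstrasse"))  # prints "2frhau"
```

Use the same spelling of your city and street in both surveys so that the two identifiers match.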