Intro Guide: Web Application Security

Web Application Security: hands-on intro

ditact women's IT studies, September 2024

jackie / Andrea Ida Malkah Klaura <jackie@tantemalkah.at>

https://tantemalkah.at/2024/web-app-sec-ditact/

All contents, unless otherwise noted, were produced by Andrea Ida Malkah Klaura
under a Creative Commons Attribution-ShareAlike 4.0 International License.

whoami

  • Andrea Ida Malkah Klaura (if you need a legal name)
  • jackie (in almost all other cases)
  • Pronouns:
    • she/her in binary gender space
    • ze/hir in the wider multiverse
  • Working mostly as open source engineer @ dieAngewandte
    • focus: backend with Python/Django and DevOps-y stuff
  • Also trying to clone myself to do:
    • teaching web-based game prototyping and machine learning @ dieAngewandte
    • and feminist technoscience (studies) @ TU Wien
    • some side web projects for NGOs (mostly WordPress based)
    • organising feminist linux meetups and other community stuff
  • More details on tantemalkah.at

what the hack...

...is all of this about?

  • brief intro to Dynamic (Web) Application Security Testing (DWAST)
  • aka (web) pentesting
  • aka ethical (web) hacking

web applications

(Source: https://xkcd.com/869/ , under CC BY-NC 2.5 license)

How does a web server work

  • A web server is basically a programme
  • that listens on one or more TCP ports (usually 80 and 443)
  • waits for requests from web clients
  • provides whatever information the client requested if it is available on the server
  • client and server talk to each other using HTTP (HyperText Transfer Protocol)
  • nowadays mostly HTTPS, which is HTTP over a TLS-encrypted session
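To make the list above a bit more tangible, here is a minimal sketch of such a programme in Python. It is not how production web servers are built, only an illustration under a few assumptions: the names build_response, serve_once and demo are made up for this example, the "site" consists of a single hard-coded page, and the server listens on a free local port picked by the operating system instead of 80 or 443.

```python
import socket
import threading


def build_response(path: str) -> bytes:
    """Build a complete HTTP/1.1 response for the requested path."""
    # Hypothetical "site content": one page, everything else is a 404.
    pages = {"/": "<h1>Hello from a tiny web server</h1>"}
    if path in pages:
        status, body = "200 OK", pages[path]
    else:
        status, body = "404 Not Found", "<h1>Not found</h1>"
    payload = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload


def serve_once(server: socket.socket) -> None:
    """Wait for one client, read its request, answer it, hang up."""
    conn, _addr = server.accept()
    with conn:
        request = conn.recv(4096).decode("iso-8859-1")
        # The first line looks like: GET /some/path HTTP/1.1
        parts = request.split("\r\n", 1)[0].split(" ")
        path = parts[1] if len(parts) >= 2 else "/"
        conn.sendall(build_response(path))


def demo() -> bytes:
    """Start the server on a free local port and send it one request."""
    # Port 0 lets the OS pick a free port; real servers use 80 and 443.
    with socket.create_server(("127.0.0.1", 0)) as server:
        port = server.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
            chunks = []
            while chunk := client.recv(4096):
                chunks.append(chunk)
        worker.join()
    return b"".join(chunks)
```

Calling print(demo().decode()) in a Python shell runs one full request/response round trip on your own machine and shows the raw response text.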

HTTP

  • is a plaintext protocol
  • you could just use telnet on the command line to speak to a server: (Source: https://commons.wikimedia.org/wiki/File:Http_request_telnet_ubuntu.png , under public domain)
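Because the protocol is plain text, a request can be assembled and a response taken apart with ordinary string handling. The following is a rough sketch, not a full HTTP parser: build_request and parse_response are hypothetical helper names, and repeated headers (such as several Set-Cookie lines) are deliberately collapsed to keep it short.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Return the text you would otherwise type into telnet."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes) -> tuple[int, str, dict[str, str], bytes]:
    """Split a raw response into status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, _, rest = status_line.partition(" ")
    code, _, reason = rest.partition(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Simplification: a repeated header overwrites the earlier one.
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body
```

Usage: build_request("www.example.org", "/admin.html") yields the bytes to send over a TCP connection to port 80, and parse_response(...) can be fed whatever comes back.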

πŸͺπŸͺπŸͺ Cookies πŸͺπŸ‘ΏπŸͺ

  • used for lots of questionable stuff
  • but also because HTTP is state-less, e.g. to remember logged in users
  • HTTP responses can contain cookies, e.g.

    
                      HTTP/1.1 200 OK
                      Content-type: text/html
                      Set-Cookie: sessionToken=abcdef01234567890; Expires=Tue, 31 Aug 2021 12:34:56 GMT
                      Set-Cookie: foo=bar
                      Set-Cookie: chocolate=good
                      Set-Cookie: raisins=evil
                    
  • In the next request to the same site the browser includes those cookies

    
                      GET /admin.html HTTP/1.1
                      Host: www.example.org
                      Cookie: sessionToken=abcdef01234567890; foo=bar; chocolate=good; raisins=evil
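What the browser does with those headers can be sketched with Python's standard http.cookies module. This is a simplified illustration only: real browsers also check expiry dates, domains, paths and flags before sending a cookie back. The names SET_COOKIE_HEADERS, remember and cookie_header are made up for this example.

```python
from http.cookies import SimpleCookie

# The Set-Cookie values from the example response, one per header line.
SET_COOKIE_HEADERS = [
    "sessionToken=abcdef01234567890; Expires=Tue, 31 Aug 2021 12:34:56 GMT",
    "foo=bar",
    "chocolate=good",
    "raisins=evil",
]


def remember(jar: SimpleCookie, set_cookie_values: list[str]) -> None:
    """Store every cookie the server sent in the cookie jar."""
    for value in set_cookie_values:
        jar.load(value)


def cookie_header(jar: SimpleCookie) -> str:
    """Build the value of the Cookie request header.

    Only name=value pairs are sent back; attributes such as Expires
    stay in the jar.
    """
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


jar = SimpleCookie()
remember(jar, SET_COOKIE_HEADERS)
print("Cookie:", cookie_header(jar))
```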
                    

What happens when I visit a website?

excerpt from Sending Passwords on Postcards, recording available on YouTube

  1. I enter https://diebin.at into my browser and hit return
  2. My browser asks a DNS server "What's the IP address of diebin.at?" and waits for the response: "Hey there, it's 176.9.22.182!"
  3. My browser sends a first HTTP request packet through the internet to the IP address 176.9.22.182
  4. The web server running on 176.9.22.182 receives the request and sends the website "diebin.at" as an HTTP response.
    (one or more response packets are sent, depending on the size of the web server's reply)
  5. My computer receives all the packets, reassembles them according to the TCP/IP protocols, and hands them to my web browser
  6. My browser shows me the response in the form of a website
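The steps above can be sketched roughly in Python using only the standard library. The function names plan_request, resolve and visit are assumptions made for this illustration, and a real browser does far more (caching, TLS details, redirects, rendering).

```python
import http.client
import socket
from urllib.parse import urlsplit


def plan_request(url: str) -> tuple[str, str, int, str]:
    """Step 1: split the URL into scheme, host, port and path."""
    parts = urlsplit(url)
    scheme = parts.scheme or "https"
    default_port = 443 if scheme == "https" else 80
    return scheme, parts.hostname or "", parts.port or default_port, parts.path or "/"


def resolve(host: str, port: int) -> list[str]:
    """Step 2: ask the system resolver (DNS) for the host's IP addresses."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def visit(url: str) -> tuple[int, bytes]:
    """Steps 3 to 5: send one HTTP request and read the whole response."""
    scheme, host, port, path = plan_request(url)
    if scheme == "https":
        conn = http.client.HTTPSConnection(host, port, timeout=10)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

With network access, resolve("diebin.at", 443) returns whatever addresses DNS currently reports, and visit("https://diebin.at") returns the status code and the raw HTML that the browser would go on to render in step 6.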

wait, wait! is that all?

  • Imagine this line in your HTML page, that the web server just sent us:
    
                      <img src="chockie.png" alt="Picture of a chocolate cookie" />
                    
  • My browser requests all the images used in the HTML page that was just loaded
  • My browser requests all the CSS files that are used to style my HTML page
  • My browser requests all the JS files that are used to make my HTML page nicely interactive
  • ... and potentially lots more (auto)magic stuff ✨🪄✨
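How a client finds those follow-up requests can be illustrated with Python's built-in HTML parser. This is only a sketch: the class name SubresourceFinder and the base URL https://www.example.org/ are assumptions, and real browsers discover many more resource types (fonts, iframes, images referenced from CSS, ...).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SubresourceFinder(HTMLParser):
    """Collect the URLs a browser would request after loading a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        target = None
        if tag in ("img", "script"):
            target = attributes.get("src")
        elif tag == "link" and (attributes.get("rel") or "").lower() == "stylesheet":
            target = attributes.get("href")
        if target:
            # Relative URLs are resolved against the page's own URL.
            self.urls.append(urljoin(self.base_url, target))


PAGE = """
<html><head>
  <link rel="stylesheet" href="style.css">
  <script src="/js/app.js"></script>
</head><body>
  <img src="chockie.png" alt="Picture of a chocolate cookie" />
</body></html>
"""

finder = SubresourceFinder("https://www.example.org/")
finder.feed(PAGE)
for url in finder.urls:
    print("GET", url)
```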

web sites vs. web applications

  • mostly static hyperlinked content vs. highly interactive programme accessible through a web browser

(Source: https://de.wikipedia.org/wiki/Datei:Webanwendung_client_server_01.png , under CC BY-SA 3.0 license)

some security basics


(Source: https://xkcd.com/538/ , under CC BY-NC 2.5 license)

Terminology

the CIA triad

  • macro level key goals in Information Security
  • Confidentiality
    • a property of an information system that ensures that users can only access the information they are authorized to read
  • Integrity
    • a property of an information system that ensures that data can be changed only by users who are authorized to do so
  • Availability
    • a property (or the degree thereof) of an information system that ensures it provides its functions whenever it is supposed to provide them

Access Control

  • differentiating between different users of an information system
  • Identification
    • the process of claiming an identity, i.e. stating who you are
    • e.g. by providing an ID card (IRL), or a username (on the web)
  • Authentication
    • a process of actually verifying that the identification is valid
    • by something you know: e.g. a password or passphrase, a PIN or some other code, an answer to a question, …
    • by something you have: e.g. a token, an ID card, an RFID chip, …
    • by something you are: e.g. a fingerprint, the scan of your retina, …
    • using more than one of these factors is called MFA (Multi-Factor Authentication)
  • Authorization
    • a process to check whether a specific user is allowed to execute a specific action (e.g. read a document, upload a file, change an existing file, send a message, …)
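The three steps can be told apart in a few lines of Python. This is a teaching sketch under stated assumptions, not a recommendation for a production login system: the in-memory USERS store, the PERMISSIONS table and the function names are all made up here, and the PBKDF2 iteration count is just a plausible placeholder.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumption: tune to your own hardware and policy
PERMISSIONS = {"reader": {"read"}, "editor": {"read", "change"}}
USERS: dict[str, dict] = {}  # hypothetical in-memory user store


def _hash(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash so stored passwords are hard to crack."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(username: str, password: str, role: str) -> None:
    salt = os.urandom(16)
    USERS[username] = {"salt": salt, "hash": _hash(password, salt), "role": role}


def authenticate(username: str, password: str) -> bool:
    """Identification is the username; authentication verifies the claim."""
    user = USERS.get(username)
    if user is None:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(user["hash"], _hash(password, user["salt"]))


def is_authorized(username: str, action: str) -> bool:
    """Authorization: may this (already authenticated) user do this?"""
    user = USERS.get(username)
    return user is not None and action in PERMISSIONS.get(user["role"], set())
```

For example, after register("jackie", "some long passphrase", "reader"), the call authenticate("jackie", "wrong guess") fails, while is_authorized("jackie", "read") succeeds and is_authorized("jackie", "change") does not.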

bad, worse, worst

  • Vulnerability
    • "A vulnerability is a hole or a weakness in the application, which can be a design flaw or an implementation bug, that allows an attacker to cause harm to the stakeholders of an application." (OWASP: Vulnerabilities)
  • Exploit
    • "a piece of software, a chunk of data, or a sequence of commands that takes advantage of a bug or vulnerability to cause unintended or unanticipated behavior" (Wikipedia: Exploit (computer security))
  • zero day exploit 👿
    • An exploit that is "unknown to everyone but the people that found and developed them" (ibid)

where do vulnerabilities come from?

  • faulty/buggy/insecure code
  • faulty/insecure use of code
  • faulty/insecure/missing systematic approach to application development

faulty/buggy/insecure code

  • because developers do not know how to write secure code
  • because developers are working under too much pressure to meet deadlines and deliver code

faulty/insecure use of code

  • because administrators do not know how to secure the service
  • because documentation is bad and it is unclear what a clean (and secure) setup looks like
  • because everyone likes to throw around the buzzword DevOps, but in practice there is no systematically engineered process in place to assure the quality and security of complex application systems (this is why there is now also DevSecOps)

faulty/insecure/missing systematic approach to application development

  • no one really cares about security
  • management layers are ignorant of the importance of security and do not allocate the appropriate resources
  • security is not integrated from the start (e.g. with a secure development lifecycle, SDLC for short)

the making of a vulnerability

...just some examples...

  • in planning & design decisions:
    • trusting the user('s browser)
    • using outdated or self-coded encryption algorithms and insecure protocols
  • while coding:
    • incomplete or missing input validation and output sanitization
    • insecure methods and queries to access a database
    • parsing user provided files (including configs from admins) without precautions
    • hardcoding credentials
  • at run-time:
    • debugging mode in production systems
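Two of the coding mistakes above, missing output sanitization and hardcoded credentials, can be sketched in a few lines of Python. The function names and the environment variable APP_DB_PASSWORD are hypothetical and only serve this illustration; in a real project the template engine of your framework usually does the escaping for you.

```python
import os
from html import escape


def greeting_unsafe(name: str) -> str:
    """Vulnerable: user input ends up in the HTML unchanged (XSS)."""
    return f"<p>Hello, {name}!</p>"


def greeting_safe(name: str) -> str:
    """Safer: special characters are escaped before they reach the page."""
    return f"<p>Hello, {escape(name)}!</p>"


def db_password() -> str:
    """Read the secret from the environment instead of hardcoding it."""
    password = os.environ.get("APP_DB_PASSWORD")  # hypothetical variable name
    if not password:
        raise RuntimeError("APP_DB_PASSWORD is not set")
    return password


payload = "<script>alert(1)</script>"
print(greeting_unsafe(payload))  # the script tag survives
print(greeting_safe(payload))    # the script tag is rendered as text
```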

Cornerstones to remember:

  • There is no "100% secure"!
  • tools and processes are available to integrate security from the start
  • but they need time, experience, and in consequence also a lot of money
  • information security is not a specific state you can reach
  • rather a chance/probability that you can increase by appropriate measures

stats & context

... we'll return here at the end of the course ...

OWASP Top 10

  • https://owasp.org/www-project-top-ten/
  • THE authoritative survey on vulnerability prevalence
  • by the Open Worldwide Application Security Project (OWASP, formerly the Open Web Application Security Project)
  • 2017: Injection in 1st place, XSS in 7th place
  • 2021: Injection (including XSS) in 3rd place

Example: WordPress

Source: The Wordfence 2023 State of WordPress Security Report

  • Top 5 vulns disclosed in 2023:
    1. XSS (1963 cases reported)
    2. CSRF (1098 cases reported)
    3. Missing Authorization and Authorization bypass (885 cases reported)
    4. SQLi (279 cases reported)
    5. Information Disclosure (98 cases reported)

takeaways

for devs

  • always validate (user) input
  • always sanitize (the final HTML) output
  • always use prepared statements or go for ORMs!
  • don't do everything from scratch, use frameworks!
    • and pleeeeease update them in time!
  • I know, I know, there is no 100% in security, but:
    if you make exceptions to the rules above, you really need to know what you are doing!
  • nag your higher-ups about secure coding workshops and a proper SDLC
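Why prepared (parameterized) statements matter can be shown with Python's built-in sqlite3 module. This is a toy example under obvious assumptions: an in-memory database, a made-up users table and two hypothetical lookup functions. The idea carries over to other databases and to ORMs, which build parameterized queries for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_unsafe(name: str) -> list[tuple]:
    """Vulnerable: the input becomes part of the SQL statement itself."""
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list[tuple]:
    """Parameterized: the input is passed separately and stays plain data."""
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


payload = "nobody' OR '1'='1"
print(find_user_unsafe(payload))  # leaks every row in the table
print(find_user_safe(payload))    # finds no user with that odd name
```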

a note on updates

this one also goes out to ops and management

"WordPress has made it significantly easier to keep plugins and themes updated with a user-friendly auto-update mechanism, and regularly pushes updates for critical vulnerabilities even in cases where the user-facing auto-update mechanism is not enabled. Nonetheless, many sites intentionally and fully disable automatic updates, even for critical security issues, which significantly increases their chances of compromise. If your organization has disabled automatic updates to prevent compatibility issues, ensure that you have a process in place to rapidly review security patches and apply them before they can be targeted." (The Wordfence 2023 State of WordPress Security Report, p. 19)

references

can be found on our course cryptpad
Collection of solutions to exercises and further materials

feel free to add some stuff